PROMOTER SEQUENCES
TECHNICAL FIELD 

The present invention relates to an isolated promoter region of the mammalian transcription
factor FOXC2. The invention also relates to screening methods for agents modulating the
expression of FOXC2 and thereby being potentially useful for the treatment of medical 
conditions related to obesity. The invention further relates to a previously unknown variant 
of the human FOXC2 gene, derived via the use of an alternative promoter, which produces 
an additional exon that generates a distinct open reading frame via splicing. The alternative 
gene encodes a variant of the FOXC2 transcription factor, which is lacking a part of the
DNA-binding domain and consequently has a potential regulatory function. 



BACKGROUND ART 

More than half of the men and women in the United States, 30 years of age and older, are
now considered overweight, and nearly one-quarter are clinically obese. This high 
prevalence has led to increases in the medical conditions that often accompany obesity, 
especially non-insulin dependent diabetes mellitus (NIDDM), hypertension, cardiovascular 
disorders, and certain cancers. Obesity results from a chronic imbalance between energy
intake (feeding) and energy expenditure. To better understand the mechanisms that lead to
obesity and to develop strategies in certain patient populations to control obesity, there is a
need to develop a better underlying knowledge of the molecular events that regulate the
differentiation of preadipocytes and stem cells to adipocytes, the major component of
adipose tissue.

The helix-loop-helix (HLH) family of transcriptional regulatory proteins are key players in 
a wide array of developmental processes (for a review, see Massari & Murre (2000) Mol. 
Cell. Biol. 20: 429-440). Over 240 HLH proteins have been identified to date in organisms 
ranging from the yeast Saccharomyces cerevisiae to humans. Studies in Xenopus laevis, 
Drosophila melanogaster, and mice have convincingly demonstrated that HLH proteins are
intimately involved in developmental events such as cellular differentiation, lineage
commitment, and sex determination. In multicellular organisms, HLH factors are required
for a multitude of important developmental processes, including neurogenesis, myogenesis, 
hematopoiesis, and pancreatic development. 

The winged helix / forkhead class of transcription factors is characterized by a 100-amino
acid, monomeric DNA-binding domain. X-ray crystallography of the forkhead domain
from HNF-3γ has revealed a three-dimensional structure, the "winged helix", in which two
loops (wings) are connected on the C-terminal side of the helix-loop-helix (for reviews, see
Brennan, R.G. (1993) Cell 74: 773-776; and Lai, E. et al. (1993) Proc. Natl. Acad. Sci.
U.S.A. 90: 10421-10423).

The isolation of the mouse mesenchyme forkhead-1 (MFH-1) and the corresponding 
human (FKHL14) chromosomal genes is disclosed by Miura, N. et al. (1993) FEBS letters 
326: 171-176; and (1997) Genomics 41: 489-492. The nucleotide sequences of the mouse 
MFH-1 gene and the human FKHL14 gene have been deposited with the EMBL/GenBank
Data Libraries under accession Nos. Y08222 (SEQ ID NO: 5) and Y08223 (SEQ ID NO: 
8), respectively. A corresponding gene has been identified in Gallus gallus (GenBank 
accession numbers U37273 and U95823). 

The International Patent Application WO 98/54216 discloses a gene encoding a
Forkhead-Related Activator (FREAC)-11 (also known as S12), which is identical with the
polypeptide encoded by the human FKHL14 gene disclosed by Miura, supra. This
transcription factor is expressed in adipose tissue and involved in lipid metabolism and
adipocyte differentiation (cf. Swedish patent application No. 0000531-4, filed February 18,
2000).

The nomenclature for the winged helix / forkhead transcription factors has been 
standardized and Fox (Forkhead Box) has been adopted as the unified symbol (Kaestner et 
al. (2000) Genes & Development 14: 142-146; see also http://www.biology.pomona.edu/fox).
It has been agreed that the genes previously designated MFH-1 and FKHL14 (as well
as FREAC-11 and S12) should be designated FOXC2.



BRIEF DESCRIPTION OF THE DRAWINGS 



Figure 1 shows the general structure of the human FOXC2 gene. 

Figure 2 illustrates the results from phylogenetic footprinting experiments. Shown is the 
fraction conserved (1.0 = 100%) between mouse FoxC2 and human FOXC2 sequences in
the alignment generated with Clustal. Solid (bold) line indicates the fraction of the human 
sequence which is identical to the mouse within a 200 bp "window" over the human 
sequence in the alignment. The weak (dotted) line is set to -0.05 when the sliding window 
contains human exon sequence and to -0.1 when the window is entirely composed of exon
sequence. Regions containing local maxima or exceeding a conservation fraction of 0.7 are 
likely to be functional and are classified as "predicted regulatory regions". 

Figure 3 illustrates the predicted "enhancer" region in the human FOXC2 gene. Underlined 
sequences indicate likely transcription factor binding sites. Boxed sequence indicates exon 
sequence. 

Splice = sequence predicted as splice site in the alternatively spliced gene;

E-box-like = sequence resembling the "E-box" motif CANNTG known as a target for DNA
binding proteins containing a helix-loop-helix domain (often associated with the activation
of cell-type specific gene transcription during tissue differentiation; see Massari & Murre
(2000) Mol. Cell. Biol. 20: 429-440);

Forkhead-like = sequence resembling binding site for the winged helix / forkhead class of
transcription factors;

Ets-like = sequence resembling consensus binding site for ETS-domain transcription factor
family (see Sharrocks et al. (1997) Int. J. Biochem. Cell Biol. 29, 1371-1387).
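
As a rough illustration of how a motif such as the E-box consensus CANNTG could be located in a nucleotide sequence, the following Python sketch scans a string with a regular expression. The function name and the example sequence are hypothetical and are not taken from SEQ ID NO: 1; a lookahead is used so that overlapping matches are also reported.

```python
import re

# E-box consensus CANNTG, where N is any of A, C, G or T.
# The lookahead lets overlapping occurrences be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_ebox_sites(seq):
    """Return (1-based start position, matched hexamer) for each CANNTG site."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1)) for m in EBOX.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical test sequence, for illustration only.
    print(find_ebox_sites("GGCAGCTGTTCACGTGAA"))
```

A match found this way only indicates resemblance to the consensus; it says nothing about whether the site is bound in vivo.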

Figure 4 illustrates the predicted "promoter" region in the human FOXC2 gene. Underlined 
sequence indicates exon sequences. Boxed sequences indicate conserved block (potential 
transcription factor binding sites). 



DESCRIPTION OF THE INVENTION 



According to the present invention, the partially known sequence (SEQ ID NO: 8) of 
human FOXC2 gene has been extended. In the previously unknown region of the gene, 
differentially conserved regions, consistent with regulatory function, have been identified. 
Further, an alternative transcript has been identified, which includes the use of at least two 
exons. The putative regulatory enhancer is immediately adjacent to the newly discovered 
alternative exon, suggesting that it may play a role in the alternative selection of transcript 
classes. 

Modulation of the FOXC2 regulation is expected to have therapeutic value in type II 
diabetes, obesity, hypercholesterolemia, and other cardiovascular diseases or
dyslipidemias. 

Consequently, in a first aspect this invention provides a human FOXC2 promoter region 
comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 1250 to 2235, such as positions 
1250 to 1749 or positions 1692 to 1703, in SEQ ID NO: 1, or a fragment thereof 
exhibiting FOXC2 promoter activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 

Another aspect of the invention is a recombinant construct comprising the human FOXC2 
promoter region as defined above. In the said recombinant construct, the human FOXC2 
promoter region can be operably linked to a gene encoding a detectable product, such as 
the human FOXC2 gene, or a reporter gene. The term "operably linked" as used herein 
means functionally fusing a promoter with a structural gene in the proper frame to express 
the structural gene under control of the promoter. As used herein, the term "reporter gene" 
means a gene encoding a gene product that can be identified using simple, inexpensive 
methods or reagents and that can be operably linked to the human FOXC2 promoter region 
or an active fragment thereof. Reporter genes such as, for example, a luciferase,
β-galactosidase, alkaline phosphatase, or green fluorescent protein reporter gene, can be used
to determine transcriptional activity in screening assays according to the invention (see, for
example, Goeddel (ed.), Methods Enzymol., Vol. 185, San Diego: Academic Press, Inc. 
(1990); see also Sambrook, supra). 

The invention also provides a vector comprising the recombinant construct as defined
above, as well as a host cell stably transformed with such a vector, or generally with the
recombinant construct according to the invention. The term "vector" refers to any carrier of
exogenous DNA that is useful for transferring the DNA to a host cell for replication and/or
appropriate expression of the exogenous DNA by the host cell.

In another aspect, the invention provides a method for identification of an agent regulating 
FOXC2 promoter activity, said method comprising the steps: (i) contacting a candidate 
agent with a human FOXC2 promoter region as defined above; and (ii) determining 
whether said candidate agent modulates expression of the FOXC2 gene, such modulation
being indicative of an agent capable of regulating FOXC2 promoter activity. As used
herein, the term "agent" means a biological or chemical compound such as a simple or 
complex organic molecule, a peptide, a protein or an oligonucleotide. 

A transfection assay can be a particularly useful screening assay for identifying an 
effective agent modulating and/or regulating FOXC2 promoter activity. In a transfection
assay, a nucleic acid containing a gene, e.g. a reporter gene, operably linked to a human 
FOXC2 promoter or an active fragment thereof, is transfected into the desired cell type. A 
test level of reporter gene expression is assayed in the presence of a candidate agent and 
compared to a control level of expression. An effective agent is identified as an agent that 
results in a test level of expression that is different from a control level of reporter gene
expression, which is the level of expression determined in the absence of the agent. 
Methods for transfecting cells and a variety of convenient reporter genes are well known in 
the art (see, for example, Goeddel (ed.), Methods Enzymol., Vol. 185, San Diego: 
Academic Press, Inc. (1990); see also Sambrook, supra). Consequently, the said method 
could, e.g., comprise assaying reporter gene expression in a host cell, stably transformed
with a recombinant construct comprising the human FOXC2 promoter, in the presence and
absence of a candidate agent, wherein an effect on the test level of expression as compared
to the control level of expression is indicative of an agent capable of regulating FOXC2
promoter activity.
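
The comparison of test and control reporter levels described above can be sketched in Python as follows. This is only a minimal illustration: the function names, the replicate readings and the two-fold threshold are assumptions made for the example and are not specified in this description.

```python
from statistics import mean


def fold_change(test_readings, control_readings):
    """Ratio of mean reporter signal with agent to mean signal without agent."""
    control = mean(control_readings)
    if control <= 0:
        raise ValueError("control level must be positive")
    return mean(test_readings) / control


def classify_agent(test_readings, control_readings, threshold=2.0):
    """Label an agent by fold change; the 2-fold threshold is an assumed cut-off."""
    ratio = fold_change(test_readings, control_readings)
    if ratio >= threshold:
        return "activator"
    if ratio <= 1.0 / threshold:
        return "repressor"
    return "no effect"


if __name__ == "__main__":
    # Hypothetical luciferase readings (arbitrary units), three replicates each.
    print(classify_agent([300, 310, 290], [100, 100, 100]))
```

In practice a statistical test across replicates would normally accompany such a ratio; the fixed threshold is used here only to keep the sketch short.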

Methods for identification of polypeptides regulating FOXC2 promoter activity could
include various techniques known in the art, such as the yeast one-hybrid system (see Li &
Herskowitz (1993) Science 262, 1870-1874) to identify proteins binding specific
sequences from the FOXC2 regulatory region, biochemical purification of proteins which
bind to the regulatory region, or the use of a "southwestern" cloning strategy (see e.g. Hai et
al. (1989) Genes & Development 3: 2083-2090) in which a pool of bacteria infected with a
"phage library" is induced to express the encoded protein and probed with radioactive
DNA sequences from the FOXC2 regulatory regions to identify binding proteins.

In a further aspect, the invention provides a human FOXC2 enhancer region 
comprising a sequence selected from: 
(a) the nucleotide sequence set forth as positions 216 to 475, such as positions 223 to
231, positions 359 to 375, positions 378 to 402, or positions 403 to 423, in SEQ ID
NO: 1, or a fragment thereof exhibiting FOXC2 enhancer activity;

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b).

As described above for the human FOXC2 promoter region, the invention further provides 
a recombinant construct comprising a human FOXC2 enhancer region, a vector comprising 
the said recombinant construct, as well as a host cell stably transformed with said vector or 
with said recombinant construct.

Further, the invention provides a method for identification of an agent regulating FOXC2 
enhancer activity, said method comprising the steps: (i) contacting a candidate agent with 
the human FOXC2 enhancer region as defined above; and (ii) determining whether said 
candidate agent modulates expression of the FOXC2 gene, such modulation being
indicative of an agent capable of regulating FOXC2 enhancer activity. It will be
understood by the skilled person that known steps are available for performing such a 
method. For instance, a "panel" of constructs which include a variety of mutations and
deletions can be used in order to associate a response with a specific alteration of a single
base or subsegment of the regulatory apparatus. A simple panel might include: enhancer
plus promoter, promoter only, enhancer plus a "minimal" promoter from a distinct gene.
As mentioned above, a transfection assay, using a host cell stably transformed with a
suitable recombinant construct, can be a particularly useful screening assay for identifying
an effective agent.

In yet a further aspect, the invention provides a method for identification of an agent 
capable of regulating a mammalian FOXC2 promoter activity, said method comprising the 
steps (i) contacting a candidate agent with a murine FoxC2 promoter nucleotide sequence
shown as positions 216 to 2235, such as positions 216 to 475 or positions 1250 to 2235, in
SEQ ID NO: 5; and (ii) determining whether said candidate agent modulates expression of
a mammalian FOXC2 gene, such modulation being indicative of an agent capable of
regulating mammalian FOXC2 promoter activity.

In another important aspect, the invention provides an isolated nucleic acid molecule 
selected from: 

(a) nucleic acid molecules comprising a nucleotide sequence as shown in SEQ ID NO: 3; 

(b) nucleic acid molecules comprising a nucleotide sequence capable of hybridizing, under
stringent hybridization conditions, to a nucleotide sequence complementary to the
polypeptide coding region of a nucleic acid molecule as defined in (a) and which codes for
a variant form of the FOXC2 transcription factor; and

(c) nucleic acid molecules comprising a nucleic acid sequence which is degenerate as a
result of the genetic code to a nucleotide sequence as defined in (a) or (b) and which codes
for a variant form of the FOXC2 transcription factor.

In a preferred form of the invention, the said nucleic acid molecule has a nucleotide 
sequence identical with SEQ ID NO: 3 of the Sequence Listing. However, the nucleic acid 
molecule according to the invention is not to be limited strictly to the sequence shown as 
SEQ ID NO: 3. Rather the invention encompasses nucleic acid molecules carrying
modifications like substitutions, small deletions, insertions or inversions, which 
nevertheless encode proteins having substantially the biochemical activity of the FOXC2 
polypeptide according to the invention. Included in the invention are consequently nucleic
acid molecules, the nucleotide sequence of which is at least 90% homologous, preferably
at least 95% homologous, with the nucleotide sequence shown as SEQ ID NO: 3 in the 
Sequence Listing. 

Included in the invention is also a nucleic acid molecule whose nucleotide sequence is
degenerate, because of the genetic code, to the nucleotide sequence shown as SEQ ID NO:
3. A sequential grouping of three nucleotides, a "codon", codes for one amino acid. Since
there are 64 possible codons, but only 20 natural amino acids, most amino acids are coded
for by more than one codon. This natural "degeneracy", or "redundancy", of the genetic
code is well known in the art. It will thus be appreciated that the nucleotide sequence
shown in the Sequence Listing is only an example within a large but definite group of
sequences which will encode the variant FOXC2 polypeptide.
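
To make the size of this "large but definite group" concrete, the following Python sketch counts how many distinct coding sequences encode a given peptide under the standard genetic code. The helper names and example peptides are hypothetical; the codon table is built from the standard code in the conventional T, C, A, G ordering.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid ("*" marks the three stop codons).
DEGENERACY = Counter(CODON_TABLE.values())


def count_encodings(peptide):
    """Number of distinct nucleotide sequences that encode the given peptide."""
    return prod(DEGENERACY[aa] for aa in peptide.upper())


if __name__ == "__main__":
    # Hypothetical three-residue peptide: Leu-Ser-Arg, each with six codons.
    print(count_encodings("LSR"))
```

Because the count is a product over residues, it grows very quickly with peptide length, which is why a single listed sequence can only be one representative of the group.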

The invention includes an isolated polypeptide encoded by the nucleic acid as defined 
above. In a preferred form, the said polypeptide has an amino acid sequence according to
SEQ ID NO: 4 of the Sequence Listing. However, the polypeptide according to the
invention is not to be limited strictly to a polypeptide with an amino acid sequence 
identical with SEQ ID NO: 4 in the Sequence Listing. Rather the invention encompasses 
polypeptides carrying modifications like substitutions, small deletions, insertions or 
inversions, which polypeptides nevertheless have substantially the biological activities of
the variant FOXC2 polypeptide. 

A further aspect of the invention is a vector harboring the nucleic acid molecule according 
to the invention. The said vector can e.g. be a replicable expression vector, which carries
and is capable of mediating the expression of a DNA molecule according to the invention.
In the present context the term "replicable" means that the vector is able to replicate in a
given type of host cell into which it has been introduced. Examples of vectors are viruses
such as bacteriophages, cosmids, plasmids and other recombination vectors. Nucleic acid
molecules are inserted into vector genomes by methods well known in the art.

Included in the invention is also a cultured host cell harboring a vector according to the
invention. Such a host cell can be a prokaryotic cell, a unicellular eukaryotic cell or a cell
derived from a multicellular organism. The host cell can thus e.g. be a bacterial cell such as
an E. coli cell; a cell from yeast such as Saccharomyces cerevisiae or Pichia pastoris, or a
mammalian cell. The methods employed to effect introduction of the vector into the host
cell are standard methods well known to a person familiar with recombinant DNA
methods.

In yet another aspect, the invention includes a method for identifying an agent capable of 
regulating expression of the nucleic acid molecule as defined above, said method 
comprising the steps (i) contacting a candidate agent with the said nucleic acid molecule; 
and (ii) determining whether said candidate agent modulates expression of the said nucleic 
acid molecule.

In another aspect the invention provides an antisense oligonucleotide having a sequence 
capable of specifically hybridizing to RNA transcribed by the alternatively spliced nucleic 
acid molecule shown as SEQ ID NO: 3, so as to prevent translation of the said RNA.
Antisense nucleic acids (preferably 10 to 20 base-pair oligonucleotides) capable of
specifically binding to control sequences for the alternatively spliced FOXC2 gene are
introduced into cells, e.g. by a viral vector or colloidal dispersion system such as a
liposome. The antisense nucleic acid binds to the target nucleotide sequence in the cell and
prevents transcription and/or translation of the target sequence. Phosphorothioate and
methylphosphonate antisense oligonucleotides are specifically contemplated for
therapeutic use by the invention. Suppression of expression of the alternatively spliced
FOXC2 gene, at either the transcriptional or translational level, is useful to generate
cellular or animal models for diseases/conditions related to lipid metabolism.
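
As a simple illustration of the sequence relationship involved, an antisense oligonucleotide directed at a transcript can be written as the reverse complement of the targeted stretch of the sense sequence. The Python sketch below assumes a DNA-alphabet sense sequence and enforces the 10 to 20 nucleotide range mentioned above; the function names and the example target are hypothetical and are not derived from SEQ ID NO: 3.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense, start, length=20):
    """Antisense oligo against `length` bases of `sense` from 1-based `start`."""
    if not 10 <= length <= 20:
        raise ValueError("length outside the 10 to 20 nucleotide range")
    if start < 1:
        raise ValueError("start must be a 1-based position")
    window = sense[start - 1:start - 1 + length]
    if len(window) != length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(window)


if __name__ == "__main__":
    # Hypothetical 10-base target, for illustration only.
    print(antisense_oligo("ATGGCCAAGT", 1, 10))
```

Selecting an effective target site involves further considerations (accessibility, specificity, backbone chemistry) that this sketch does not attempt to model.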

In yet another aspect, the invention provides a method for the identification of polypeptides
which bind to nucleotide sequences involved in the biological pathway regulating lipid 
metabolism and/or adipocyte differentiation, comprising the steps of: 

(a) transfecting a host cell line with a human FOXC2 nucleotide sequence linked to a 
reporter gene, such as a gene encoding Green Fluorescent Protein (GFP) (for a review, see
e.g. Galbraith et al. (1999) Methods in Cell Biology 58: 315-341);

(b) transfecting the said host cell line with a variety of human cDNA sequences, e.g. 
sequences included in a cDNA library;

(c) identifying and isolating cells, e.g. by FACS cell sorting, having an altered level of
expression of the said reporter gene, which is indicative that the polypeptide encoded by
the added cDNA up- or downregulates at least one gene involved in the biological pathway
regulating lipid metabolism and/or adipocyte differentiation;

(d) recovering cDNA from the cells isolated in step (c), by standard procedures, e.g. PCR
or a CRE-LOX mediated procedure (see e.g. Sauer (1998) Methods 14: 381-392); and

(e) identifying the polypeptide expressed by the cDNA recovered in step (d), e.g. by
sequencing the cDNA and comparing the obtained sequence against sequence databases.

Throughout this description the terms "standard protocols" and "standard procedures",
when used in the context of molecular biology techniques, are to be understood as 
protocols and procedures found in an ordinary laboratory manual such as: Current 
Protocols in Molecular Biology, editors F. Ausubel et al., John Wiley and Sons, Inc. 1994, 
or Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A laboratory manual,
2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY 1989.



EXAMPLES 

Additional features of the invention will be apparent from the following Examples.
Examples 1 to 4 are actual, while the remaining Examples are prophetic. 

EXAMPLE 1: Computational identification of FOXC2 genomic sequences

The sequences present in the GenBank database (http://www.ncbi.nlm.nih.gov) were
screened for sequence similarity to the human FOXC2 cDNA sequence (GenBank
accession number NM_005251 (SEQ ID NO: 9)). The BLAST algorithm (Altschul et al.
(1997) Nucleic Acids Res. 25:3389-3402) was used for determining sequence identity.
Software for performing BLAST analyses is publicly available through the National Center
for Biotechnology Information (http://www.ncbi.nlm.nih.gov). A working draft genomic
sequence in 25 unordered pieces, from the Homo sapiens chromosome 16 clone
RP11-463O9 (GenBank accession number AC009108; Version 6; GI:7689930; released 4 May
2000), was selected for further studies.

Regions in sequence AC009108 matching portions of the FOXC2 cDNA sequence
NM_005251 were combined using the PHRAP software, developed at the University of
Washington (http://www.genome.washington.edu/UWGC/analysistoo...). Two
contigs of 9780 bp (positions 116445 to 126224 in GenBank AC009108.6) and 3784 bp
(positions 42927 to 46710 in GenBank AC009108.6), respectively, were assembled to
generate a human FOXC2 genomic fragment of 13451 bp.

The ClustalW multiple sequence alignment program, version 1.8 (Thompson et al. (1994)
Nucleic Acids Research 22: 4673-4680), was then used to identify the human FOXC2
extended genomic DNA sequence of 6458 bp (SEQ ID NO: 1) by comparison with the
mouse cDNA sequence X74040 (SEQ ID NO: 6). First, a 6459 bp sequence, corresponding
to positions 1500-7958 in the 13451 bp sequence, was selected. Positions 1-2285 in this
6459 bp sequence corresponded to 44426-46710 in AC009108.6, while positions 2151-6459
corresponded to positions 126224-121916 (reverse complement taken) in
AC009108.6. The overlap of positions 2151-2285 allowed for the contigs to be joined by
the assembly program. The G residue in position 2655 was considered to be a sequencing
error and was removed, which resulted in the 6458 bp sequence set forth as SEQ ID NO: 1.
The open reading frame in SEQ ID NO: 1 encodes a polypeptide (SEQ ID NO: 2) identical
with the known human FOXC2 polypeptide shown as SEQ ID NO: 10.
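
The coordinate relationships stated above can be checked with a short Python sketch. It maps a position in the 6459 bp working sequence to the corresponding coordinate(s) in AC009108.6, returning two mappings inside the 2151-2285 overlap. The function names are hypothetical, and the conversion from SEQ ID NO: 1 positions is an assumption that follows only from the stated removal of one residue at position 2655.

```python
def to_ac009108(pos):
    """Map a 1-based position in the 6459 bp working sequence to AC009108.6.

    Returns a list of (coordinate, strand) pairs; positions in the overlap
    2151-2285 are covered by both contigs and therefore yield two pairs.
    """
    if not 1 <= pos <= 6459:
        raise ValueError("position outside the 6459 bp working sequence")
    hits = []
    if pos <= 2285:
        # Positions 1-2285 correspond to 44426-46710 (forward strand).
        hits.append((44425 + pos, "+"))
    if pos >= 2151:
        # Positions 2151-6459 correspond to 126224-121916 (reverse complement).
        hits.append((126224 - (pos - 2151), "-"))
    return hits


def working_pos_from_seq_id_1(pos):
    """Assumed conversion: SEQ ID NO: 1 lacks the residue at working position 2655."""
    if not 1 <= pos <= 6458:
        raise ValueError("position outside SEQ ID NO: 1")
    return pos if pos < 2655 else pos + 1


if __name__ == "__main__":
    for p in (1, 2151, 2285, 6459):
        print(p, to_ac009108(p))
```

Both ranges have consistent lengths (2285 and 4309 positions respectively), which is a quick sanity check on the coordinates as transcribed.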

EXAMPLE 2: Identification of potential regulatory sequences in the human and mouse 
FOXC2 genomic sequences 

In phylogenetic footprinting (for a review, see Duret & Bucher (1997) Current Opinion in 
Structural Biology 7(3): 399-406) sequences are aligned and a regional sequence identity is 
determined for each window of a fixed, arbitrary length. This allows the identification of 
potential regulatory regions in genomic sequences. Non-exon sequences that are conserved 
over the course of evolution are likely to perform regulatory roles. Phylogenetic
footprinting was performed as described in Wasserman & Fickett (1998) J. Mol. Biol. 278,
167-181, based on an alignment generated with the ClustalW multiple sequence alignment
program, version 1.8 (Thompson et al. (1994) Nucleic Acids Research 22: 4673-4680),
with default parameters adjusted to a gap opening penalty of 20 and a gap extension
penalty of 0.2. The human (SEQ ID NO: 1) and mouse (SEQ ID NO: 5) genomic
sequences were aligned. Percentage identity was plotted for each contiguous 200 bp
segment of the human gene to identify segments differentially conserved (in comparison to
adjoining sequences) (Fig. 2).
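
The windowed identity calculation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of Wasserman & Fickett: it takes two rows of a pairwise alignment (gaps written as "-"), skips columns where the human row has a gap, and reports the fraction of identical bases in every window of human bases. The function name and the toy alignment are hypothetical.

```python
def window_identity(aligned_human, aligned_mouse, window=200):
    """Fraction of human bases identical to mouse in each sliding window.

    Both arguments are rows of the same alignment, so they must have equal
    length. Columns with a gap in the human row are ignored, so the window
    is measured over the human sequence, as in the description.
    """
    if len(aligned_human) != len(aligned_mouse):
        raise ValueError("alignment rows must have equal length")
    matches = [
        h == m
        for h, m in zip(aligned_human.upper(), aligned_mouse.upper())
        if h != "-"
    ]
    if len(matches) < window:
        return []
    total = sum(matches[:window])
    fractions = [total / window]
    for i in range(window, len(matches)):
        # Slide by one base: add the entering column, drop the leaving one.
        total += matches[i] - matches[i - window]
        fractions.append(total / window)
    return fractions


if __name__ == "__main__":
    # Toy alignment with a 4-base window, for illustration only.
    print(window_identity("ACGT-ACGT", "ACGTTAC-A", window=4))
```

Windows whose fraction exceeds a chosen level (0.7 in the figure legend) or that form local maxima could then be flagged as candidate regulatory regions.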

In addition to segments of the published exon sequence, two differentially conserved 
regions or "footprints" were identified in the human gene. Both of these regions are local 
maxima and contain segments which exceed 70% nucleotide identity between the human
and mouse genomic sequences. One region, shown as positions 1250 to 2235, in particular
positions 1250 to 1749, in SEQ ID NO: 1, immediately adjacent to the published exon
region, is likely to contain the transcription start site and proximal promoter regulatory
sequences (Fig. 4). Another region, shown as positions 216 to 475 in SEQ ID NO: 1,
approximately 1700 bp distal from the transcription start site, is likely to function as some
form of regulatory region (either enhancer or repressor) (Fig. 3). (A schematic overview of
the extended FOXC2 gene is shown in Fig. 1.)

Further analysis of these regulatory regions identified short segments of higher 
conservation between the mouse and human genes, suggesting that these specific segments
function as transcription factor binding sites. The TRANSFAC transcription factor database
(http://transfac.gbf.de) (see Wingender et al. (2000) Nucleic Acids Research 28(1): 316-319)
was screened for matches to known transcription factors. Consensus sites (identifiers
R05066; R05067; R05068; and R05069) were found to match sequences conserved
between the human FOXC2 and mouse FoxC2 genes. This suggests the presence of
multiple forkhead-like binding sites in the distal regulatory enhancer, and potential
auto-regulation of FOXC2 by its protein product.

The same analysis was performed with reference to 200 bp contiguous segments of the 
mouse FoxC2 genomic sequence (SEQ ID NO: 5). The following conserved regions were 
identified: 190 to 420; 1070 to 1645; and 5580 to 5875. They correlate to the regions
indicated above for the human sequence and should be considered orthologous regions.

EXAMPLE 3: Identification of an alternative human FOXC2 cDNA sequence

BLASTN screening of the dbEST database from GenBank, using the human FOXC2 
cDNA (SEQ ID NO: 9) as a query sequence, revealed several overlapping ESTs containing
portions of the available cDNA. A specialized tool, est_genome (http://www.sanger.ac.uk),
for the prediction of exon boundaries using ESTs was applied to compare the EST
sequences to the genomic sequences (see Mott, R. (1997) Computer Applications in the
Biosciences 13(4): 477-478). Two classes of ESTs were observed: sequences extending
into the 3'-untranslated region and sequences revealing an alternative first exon spliced to
a junction internal to the previously described first exon.

Specifically, it was found that the nucleotides in positions 33 to 182 in the EST with 
accession no. AW271272 (SEQ ID NO: 11) were identical to positions 66 to 215 in the
extended FOXC2 genomic sequence (SEQ ID NO: 1), and that positions 183 to 327 in
SEQ ID NO: 11 were identical to positions 2516 to 2660 in SEQ ID NO: 1. Similarly,
positions 5 to 55 in the EST with accession no. AW793237 (SEQ ID NO: 12) were
identical to positions 165 to 215 in the extended FOXC2 genomic sequence (SEQ ID NO:
1), and positions 56 to 157 in SEQ ID NO: 12 were identical to positions 2516 to 2607 in
SEQ ID NO: 1. These results revealed an alternative splicing pattern in the human FOXC2
gene. According to this splicing pattern, an alternative gene sequence (SEQ ID NO: 3) is
derived by joining the regions shown as positions 1-215 and 2516-6458 in SEQ ID NO: 1.
Alternative splicing patterns are known to regulate the synthesis of a variety of peptides
and proteins. They may result in proteins with an entirely different function or in dysfunctional
or inhibitory splice products (for a review, see McKeown (1992) Annu. Rev. Cell. Biol. 8:
133-155).
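
The joining of the two regions can be expressed as a simple slicing operation. The Python sketch below uses 1-based inclusive coordinates as in the text; the function names are hypothetical, and a placeholder string of the stated length stands in for SEQ ID NO: 1, which is not reproduced here.

```python
# Regions of SEQ ID NO: 1 joined in the alternative transcript (1-based, inclusive).
ALT_SEGMENTS = [(1, 215), (2516, 6458)]


def join_segments(genomic, segments):
    """Concatenate the given 1-based inclusive segments of a genomic sequence."""
    for start, end in segments:
        if not 1 <= start <= end <= len(genomic):
            raise ValueError(f"segment {start}-{end} outside the sequence")
    return "".join(genomic[start - 1:end] for start, end in segments)


def spliced_length(segments):
    """Length implied by the segment coordinates alone."""
    return sum(end - start + 1 for start, end in segments)


if __name__ == "__main__":
    placeholder = "N" * 6458  # stand-in for SEQ ID NO: 1, same stated length
    print(len(join_segments(placeholder, ALT_SEGMENTS)))
```

The length printed is simply what the stated coordinates imply (215 bases from the first region plus 3943 from the second).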

The amino acids corresponding to positions 1 to 94 in the published FOXC2 transcription 
factor (SEQ ID NO: 10) are missing in the protein encoded by the spliced variant generated
from the alternative promoter (SEQ ID NO: 4). Consequently, the entire region N-terminal
of the DNA binding domain and a portion of the DNA-binding domain (corresponding to
positions 72-94 in SEQ ID NO: 2) are not present in the splice variant. It is postulated that
this truncation leads to a protein which has a deficient "forkhead" DNA-binding region,
and thus has a potential inhibitory function on the biological activities of the FOXC2
protein. This truncated FOXC2 protein may have a role in regulation of FOXC2, and an
involvement in adipocyte differentiation and adipogenesis.

EXAMPLE 4: Cloning and sequencing of the FOXC2 promoter

The DNA region corresponding to nucleotide 176 to nucleotide 2233 (SEQ ID NO: 1,
version 2) has been cloned using nested PCR on human genomic DNA. The PCR was
performed according to the Herculase™ protocol (Stratagene catalog #600260;
http://www.stratagene.com/pcr/herculase.htm) and with the inclusion of 8-10% DMSO.

In the initial reaction, the 5'-primer KRKX131 (CCATTGCCTTCTAGTCGCCTCC) was
used together with the 3'-primer KRKX133 (CGTTGGGGTCGGACACGGAGTA) using
250 ng Clontech Genomic DNA # 6550-1 as template. The nested reaction was performed
on 1/100 of the initial PCR reaction using the 5'-primer KRKX132
(GGTACCTACGCAGCCGATGAACAGCCA) and the 3'-primer KRKX134
(GCTAGCGCTGCTTCCGAGACGGCTCG). After the second PCR, the product was
analyzed by electrophoresis in a 1.2% agarose gel, and a PCR product of the expected size
was obtained and extracted for ligation into a TOPO PCR2.1 vector (Invitrogen, Carlsbad,
CA) by standard cloning procedures and thereafter sequenced. The PCR reaction and
cloning procedure were repeated in two parallel separate experiments, and sequence data
from the two separate reactions were compared with the bioinformatically assembled
sequence.

A DNA region containing the promoter (Fig. 4) corresponding to nt 1179 to 2233 (SEQ ID
NO: 1, version 2) has been cloned using nested PCR in the same manner as described
above. In the initial reaction, the 5'-primer KRKX136
(GGTACCCCCCGAGCCTGGAAACTCCCT) was used together with the 3'-primer
KRKX134 (GCTAGCGCTGCTTCCGAGACGGCTCG) using 250 ng genomic DNA as a
template. The PCR reaction and cloning procedure were repeated in four parallel separate
experiments, and sequence data from the four separate reactions were compared with the
bioinformatically assembled sequence.
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EXAMPLE 5: Tissue expression profiling of the alternative transcript 

Tissue expression profiling of the alternative transcript (SEQ ID NO: 3) is performed using 
standard Northern blotting procedures. RNA samples from an array of human tissues, 
including adipose tissue, are analyzed using an RNA or DNA probe specific for the 
alternative transcript. The expression profile in adipose tissue could be indicative of a 
putative regulatory function for the alternative gene product (SEQ ID NO: 4) on 
adipogenesis and adipocyte differentiation. 


In addition, reverse transcriptase PCR (RT-PCR) according to standard procedures is used 
to detect very low level expression of the alternative transcript in adipose tissue. RNA is 
prepared from human adipose tissue, and RT-PCR is performed using PCR primers 
specific for the alternative transcript. 


EXAMPLE 6: Mapping of the 5'-edge of the alternative exon by RACE-PCR 

RNA is prepared from human adipose tissue using standard protocols. RACE (Rapid 
Amplification of cDNA Ends) PCR is performed using the SMART™ RACE cDNA 
Amplification Kit (Clontech catalogue No. K1811-1; http://www.clontech.com/product/catalog/PCR/smartrace.html). 
With this procedure, the first strand synthesis produces 
cDNA with an extension containing a known sequence. Due to the mechanism of the 
extension procedure, the extension is typically added only to complete first strand cDNAs. 
The 5'-RACE PCR is then performed using the 5'-primer provided with the kit, together 
with a 3'-primer corresponding to positions 210-237 in SEQ ID NO: 3 
(GAACTGGTAGATGCCGTTCAAGGTTTCC) specific for the alternative transcript. The 
PCR product is cloned into a cloning vector and sequenced using standard protocols. 
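
The specificity of the 3'-primer for the alternative transcript follows from its position across the splice junction. As a sketch under stated assumptions, the junction sequence is assembled below from two fragments of SEQ ID NO: 1, using the exon boundaries in Table II (first exon ending at position 215, second exon starting at position 2516), and compared with the reverse complement of the primer. The fragment variables are hypothetical names, and the fragments are transcribed from the SEQ ID NO: 1 listing rather than from SEQ ID NO: 3 itself.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


RACE_PRIMER = "GAACTGGTAGATGCCGTTCAAGGTTTCC"   # 3'-primer from Example 6

# Fragments transcribed from the SEQ ID NO: 1 listing (assumed error-free).
EXON1_TAIL = "GGAAAC"                    # SEQ ID NO: 1 positions 210-215
EXON2_HEAD = "CTTGAACGGCATCTACCAGTTC"    # SEQ ID NO: 1 positions 2516-2537

# Expected sense-strand sequence at SEQ ID NO: 3 positions 210-237.
junction = EXON1_TAIL + EXON2_HEAD
target = reverse_complement(RACE_PRIMER)

print(len(RACE_PRIMER))     # 28, matching the span 210-237
print(target == junction)   # True if the primer spans the splice junction
```

Because the six 3'-terminal nucleotides of the primer pair with the end of the alternative first exon, a primer of this design would not be expected to anneal along its full length to the published transcript.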



EXAMPLE 7: Functional analysis 

The identified regulatory regions are analyzed to determine their impact on the 
transcription of the FOXC2 gene or a reporter gene substituted for FOXC2. A PCR 
reaction is performed to isolate the promoter region adjacent to the published exon 
sequence, possibly including the sequences extending to the beginning of the ATG 
encoding the first methionine. This PCR product is cloned into a reporter plasmid adjacent 
to a reporter gene (e.g. luciferase). The upstream regulatory region, i.e. regions containing 
both upstream and promoter proximal sequences, or these sequences bearing artificially 
induced differences, are cloned in a similar manner. These constructs are transfected into a 
cell culture model system and the level/activity of the protein encoded by the reporter gene 
is determined. This provides information on the function of the identified regions 
and can be used to assess the impact of the different regions on transcriptional regulation. 
Similarly, the upstream regulatory region, a region containing both upstream and promoter 
proximal sequences, or these sequences bearing artificially induced differences can be 
cloned and used to assess the impact of these regions on the transcription of the reporter 
gene. 



EXAMPLE 8: Reporter gene assay to identify modulating compounds 

Reporter gene assays are well known as tools to signal transcriptional activity in cells. (For 
a review of chemiluminescent and bioluminescent reporter gene assays, see Bronstein et al. 
(1994) Analytical Biochemistry 219, 169-181.) For instance, the photoprotein luciferase 
provides a useful tool for assaying for modulators of promoter activity. Cells are 
transiently transfected with a reporter construct which includes a gene for the luciferase 
protein downstream from the FOXC2 promoter and enhancer region, or fragments thereof 
regulating the FOXC2 activity. Luciferase activity may be quantitatively measured using 
e.g. luciferase assay reagents that are commercially available from Promega (Madison, 
WI). Differences in luminescence in the presence versus the absence of a candidate 
modulator compound are indicative of modulatory activity. 
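
The comparison of luminescence in the presence and absence of a candidate compound can be expressed as a fold change. The sketch below is one possible way to do this and is not part of the original disclosure: the function names, the two-fold threshold and the luminescence readings are all hypothetical and serve only to illustrate the calculation.

```python
from statistics import mean


def fold_change(treated: list[float], untreated: list[float]) -> float:
    """Ratio of mean luminescence with versus without the candidate compound."""
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated luminescence must be positive")
    return mean(treated) / baseline


def is_modulator(treated: list[float], untreated: list[float],
                 threshold: float = 2.0) -> bool:
    """Flag activation (>= threshold) or repression (<= 1/threshold).

    The threshold is an arbitrary illustrative cut-off, not a value
    taken from the source.
    """
    ratio = fold_change(treated, untreated)
    return ratio >= threshold or ratio <= 1.0 / threshold


# Hypothetical relative light units from replicate wells.
untreated_wells = [1000.0, 1100.0, 900.0]
treated_wells = [2900.0, 3100.0, 3000.0]

print(fold_change(treated_wells, untreated_wells))
print(is_modulator(treated_wells, untreated_wells))
```

In practice replicate variability and a normalization control would also need to be taken into account. The sketch deliberately omits both.
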
TABLE I 

Summary of FOXC2 sequences 

SEQ ID NO:  GenBank         Description 
            accession no. 
----------  --------------  --------------------------------------------------- 
1                           Human FOXC2 extended genomic DNA sequence 
2                           Human FOXC2 polypeptide sequence 
                            (Identical with SEQ ID NO: 10) 
3                           Human FOXC2 DNA sequence 
                            Alternative splicing 
4                           Human polypeptide sequence 
                            Alternative open reading frame 
5           Y08222          Mouse MHF-1 (FoxC2) genomic DNA sequence 
                            (CDS 2070-3554) 
6           X74040          Mouse MHF-1 (FoxC2) cDNA sequence 
7                           Mouse MHF-1 (FoxC2) polypeptide sequence 
8           Y08223          Human FKHL14 (FOXC2) genomic DNA sequence 
                            (CDS 1197-2702) 
9           NM_005251       Human FKHL14 (FOXC2) cDNA sequence 
10                          Human FKHL14 (FOXC2) polypeptide sequence 
11          AW271272        Human EST 
12          AW793237        Human EST 

TABLE II 

Summary of features in human FOXC2 sequences shown as SEQ ID NOs: 1 and 3 

Feature                                                        Positions 
-------------------------------------------------------------  ----------- 
SEQ ID NO: 1 
First exon according to the alternative transcript             1-215 
- Untranslated region                                          1-186 
- Region coding for 5'-part of alternative protein             187-215 
Alternative first exon splice site                             215-216 
Predicted enhancer region                                      216-475 
- E-box-like region                                            223-231 
- Forkhead-like region                                         359-375 
- Forkhead-like region                                         378-402 
- Ets-like region                                              403-423 
Predicted promoter region                                      1250-1749 
- Forkhead-like region                                         1692-1703 
First exon according to the published form of the transcript   1746-4629 
- Untranslated region                                          1746-2234 
- Polypeptide coding region                                    2235-3740 
- Region coding for DNA-binding domain                         2448-2735 
Second exon according to the alternative transcript            2516-4629 
- Portion of polypeptide used in alternative transcript        2516-3740 
- Untranslated region                                          3741-4629 

SEQ ID NO: 3 
Polypeptide coding region (5' of splice site)                  187-215 
Polypeptide coding region (3' of splice site)                  216-1437 
- Region coding for truncated portion of protein               216-435 

CLAIMS 

1. A human FOXC2 promoter region comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 1692 to 1703 in SEQ ID 
NO: 1, or a fragment thereof exhibiting FOXC2 promoter activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 

2. The human FOXC2 promoter region according to claim 1, comprising a 
sequence selected from: 

(a) the nucleotide sequence set forth as positions 1250 to 1749 in SEQ ID 
NO: 1, or a fragment thereof exhibiting FOXC2 promoter activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 

3. The human FOXC2 promoter region according to claim 2, comprising a 
sequence selected from: 

(a) the nucleotide sequence set forth as positions 1250 to 2235 in SEQ ID 
NO: 1, or a fragment thereof exhibiting FOXC2 promoter activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 

4. A recombinant construct comprising the human FOXC2 promoter region according 
to any one of claims 1 to 3. 

5. The recombinant construct according to claim 4 wherein the human FOXC2 
promoter region is operably linked to a gene encoding a detectable product. 

6. The recombinant construct according to claim 5 wherein said gene encoding a 
detectable product is a human FOXC2 gene. 






7. The recombinant construct according to claim 4 further comprising a reporter gene. 

8. A vector comprising the recombinant construct according to any one of claims 4 to 7. 

9. A host cell stably transformed with the recombinant construct according to any one 
of claims 4 to 7. 



10. A method for identification of an agent regulating FOXC2 promoter activity, said 
method comprising the steps 

(i) contacting a candidate agent with a human FOXC2 promoter region as defined in 
any one of claims 1 to 3; and 

(ii) determining whether said candidate agent modulates expression of the FOXC2 
gene, such modulation being indicative for an agent capable of regulating FOXC2 
promoter activity. 

11. A method for identification of an agent regulating FOXC2 promoter activity, said 
method comprising assaying reporter gene expression in a cell according to claim 9 
in the presence and absence of a candidate agent, wherein an effect on the test level 
of expression as compared to control level of expression is indicative of an agent 
capable of regulating FOXC2 promoter activity. 

12. A human FOXC2 enhancer region comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 223 to 231 in SEQ ID NO: 1, 
or a fragment thereof exhibiting FOXC2 enhancer activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 

13. A human FOXC2 enhancer region comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 359 to 375 in SEQ ID NO: 1, 
or a fragment thereof exhibiting FOXC2 enhancer activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 

14. A human FOXC2 enhancer region comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 378 to 402 in SEQ ID NO: 1, 
or a fragment thereof exhibiting FOXC2 enhancer activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 



15. A human FOXC2 enhancer region comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 403 to 423 in SEQ ID NO: 1, 
or a fragment thereof exhibiting FOXC2 enhancer activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 

16. The human FOXC2 enhancer region according to any one of claims 12 to 15 
comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 216 to 475 in SEQ ID NO: 1, 
or a fragment thereof exhibiting FOXC2 enhancer activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 



17. A recombinant construct comprising a human FOXC2 enhancer region according to 
any one of claims 12 to 15. 

18. A vector comprising the recombinant construct according to claim 17. 

19. A host cell stably transformed with the recombinant construct according to claim 18. 

20. A method for identification of an agent regulating FOXC2 enhancer activity, said 
method comprising the steps 

(i) contacting a candidate agent with the human FOXC2 enhancer region as defined 
in any one of claims 12 to 16; and 
(ii) determining whether said candidate agent modulates expression of the FOXC2 
gene, such modulation being indicative for an agent capable of regulating FOXC2 
enhancer activity. 

21. A method for identification of an agent capable of regulating FOXC2 enhancer 
activity, said method comprising assaying reporter gene expression in a cell as 
defined in claim 19 in the presence and absence of a candidate agent, wherein an 
effect on the test level of expression as compared to control level of expression is 
indicative of an agent capable of regulating FOXC2 enhancer activity. 

22. A method for identification of an agent capable of regulating a mammalian FOXC2 
promoter activity, said method comprising the steps 

(i) contacting a candidate agent with a murine FoxC2 promoter nucleotide sequence 
shown as positions 1250 to 2235 in SEQ ID NO: 5; and 

(ii) determining whether said candidate agent modulates expression of a mammalian 
FOXC2 gene, such modulation being indicative for an agent capable of regulating 
mammalian FOXC2 promoter activity. 

23. A method for identification of an agent capable of regulating a mammalian FOXC2 
enhancer activity, said method comprising the steps 

(i) contacting a candidate agent with a murine FoxC2 enhancer nucleotide sequence 
shown as positions 216 to 475 in SEQ ID NO: 5; and 

(ii) determining whether said candidate agent modulates expression of a mammalian 
FOXC2 gene, such modulation being indicative for an agent capable of regulating 
mammalian FOXC2 enhancer activity. 

24. A method for identification of an agent capable of regulating a mammalian FOXC2 
enhancer activity, said method comprising the steps 

(i) contacting a candidate agent with a murine FoxC2 enhancer nucleotide sequence 
shown as positions 216 to 2235 in SEQ ID NO: 5; and 

(ii) determining whether said candidate agent modulates expression of a mammalian 
FOXC2 gene, such modulation being indicative for an agent capable of regulating 
mammalian FOXC2 enhancer activity. 

25. An isolated nucleic acid molecule selected from: 

(a) nucleic acid molecules comprising a nucleotide sequence as shown in SEQ ID 
NO: 3; 

(b) nucleic acid molecules comprising a nucleotide sequence capable of hybridizing, 
under stringent hybridization conditions, to a nucleotide sequence complementary 
to the polypeptide coding region of a nucleic acid molecule as defined in (a) and which 
codes for a variant form of the FOXC2 transcription factor; and 

(c) nucleic acid molecules comprising a nucleic acid sequence which is degenerate as 
a result of the genetic code to a nucleotide sequence as defined in (a) or (b) and 
which codes for a variant form of the FOXC2 transcription factor. 

26. An isolated polypeptide encoded by the nucleic acid according to claim 25. 

27. The isolated polypeptide according to claim 26 having an amino acid sequence 
shown as SEQ ID NO: 4 in the Sequence Listing. 

28. A vector harboring the nucleic acid molecule according to claim 25. 

29. A replicable expression vector, which carries and is capable of mediating the 
expression of a nucleotide sequence according to claim 25. 

30. A cultured host cell harboring a vector according to claim 28 or 29. 

31. A process for production of a variant form of the FOXC2 transcription factor 
polypeptide, comprising culturing a host cell according to claim 30 under conditions 
whereby said polypeptide is produced, and recovering said polypeptide. 

32. A method for identifying an agent capable of regulating expression of the nucleic 
acid molecule according to claim 25, said method comprising the steps 

(i) contacting a candidate agent with the said nucleic acid molecule; and 

(ii) determining whether said candidate agent modulates expression of the said 
nucleic acid molecule. 

33. An antisense oligonucleotide having a sequence capable of specifically hybridizing 
to RNA transcribed by the nucleic acid molecule according to claim 25, so as to 
prevent translation of the said RNA. 


34. A method for the identification of polypeptides which bind to nucleotide sequences 
involved in the biological pathway regulating lipid metabolism and/or adipocyte 
differentiation, comprising 

(a) transfecting a host cell line with a human FOXC2 nucleotide sequence linked to a 
reporter gene; 

(b) transfecting the said host cell line with a variety of human cDNA sequences; 

(c) identifying and isolating cells having an altered level of expression of the said 
reporter gene; 

(d) recovering cDNA from the cells isolated in step (c); and 

(e) identifying the polypeptide expressed by the cDNA recovered in step (d). 



ABSTRACT 



The present invention relates to an isolated promoter region of the mammalian transcription 
factor FOXC2. The invention also relates to screening methods for agents modulating the 
expression of FOXC2 and thereby being potentially useful for the treatment of medical 
conditions related to obesity. The invention further relates to a previously unknown variant 
of the human FOXC2 gene, derived via the use of an alternative promoter, which produces 
an additional exon that generates a distinct open reading frame via splicing. The alternative 
gene encodes a variant of the FOXC2 transcription factor, which is lacking a part of the 
DNA-binding domain and consequently has a potential regulatory function. 

SEQUENCE LISTING 

<110> Pharmacia & Upjohn AB 

<120> Promoter Sequences 

<130> 00298 

<140> 
<141> 

<160> 12 

<170> PatentIn Ver. 2.1 

<210> 1 

<211> 6458 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (2235)..(3740) 
<400> 1 





cctttggctt tgaattgatc aggagacaaa gataatgcat ctacattttc gtcttctgtt 60 
cttttattgg aaataagtgg cacgccccat tgccttctag tcgcctcccc gaagcgaaga 120 
ggccgaagcg aagaggcctg gtgggttgtc tcaacatcct tttgctgaga atcgaatacg 180 
cagccgatga acagccagga agggtgcaag gaaacctgaa atacaaatgt tctccctgaa 240 
gccctcttcc ctgcccaacc agaccagcaa cttccaaaat tctgcccgtg tttagccttg 300 
ttaaaggggt gtctcactcc ttcagggaaa gtgggaaaag gggatctgat tattgaggtg 360 
tggaaggaat aaataatcag tccacaaata aacaaactgt ccgggattcc tagagggaag 420 
gagaaatcct tgaaggagat ccaagtcgct ccaggtctgc ctgccgaata atatcatccc 480 
gaagggatct tgaaccgttt gcaatcaacc gctcacccag tcttcccacg gagcgcgctc 540 
cctaactcac cctacccacc caacaaaaca aaaaaaaggc tgaaatatag aaaagcaact 600 
tggaggctcc cagggggacg ttgccaggag caggaggcag ggacagcgcc ctagggtcgg 660 
tgttagcggc cggcgccggc ctgggccacg ggaaacgtcc acgcttggtg cccgcggtgc 720 
gcggcgctca ttgcgcgcgc cttcgagcca agcccccgcg gaaaacaggc tcgggtttct 780 
cctcgcaggg cccaggaact cggctctgcc tggcccgggt gggtcgctgc attgtcccgg 840 
tcttctggga gtgcggggtc agcttgttag agggaatttc tacctgggaa aagggagacg 900 
agtttcgaag ctgaagttgg taggctgcga gtgtccacgc gggagacgaa agggggaaat 960 




agcagagtca cttcaccctt ttccccaaac cccacaaaac tgctcgcagc gacgcggatg 1020 
atctaccgaa ttccccgcga attcggagga traagrtgtc agccagcacg ttgctacctt 1080 
cccctctatg cactccgctg cctggctcct cggcggggag cgagggaaac tcagtttgta 1140 
gggtttacct ctaaaacctc gataggttat ccttgacgac cccgagcctg gaaactccct 1200 
gttgatgatt aattatttiga ttaaataagt ataacatcca ggagaggccc tgccattcca 1260 
atccagcgcg tttgcttttg aatccattac acctgggccc ccataattag gaaatctaat 1320 
csttcgcttc atcactca11 aataagaaaa atgtcccagg atcattgcta cccacaagyT 1380 
ctttgggaga gatattttac tctattaatc cattctattt tatatttcaa attgattttt 1440 
tttaacagag gaaagtggct atctttttgt tttgggcatg tgggcccatt caccaaaatg 1500 
tgatcataaa ataaatttta ataagatata actttttaaa aagttttcaa gtgaagacgg 1560 
agtcgccgcg gaggccgggg cggcggggtc ttagagccga cggattcctg cgctcctcgc 1620 
cccgattggc gccggactcc tctcagctgc cgggtgattg gctcaaagtt ccgggagggg 1680 
gcgtggcccg aggaaagtaa aaactcgctt tcagcaagaa gacttttgaa acttttccca 1740 
atccctaaaa gggacttggc ctctxtttct gggctcagcg gggcagccgc tcggaccccg 1800 
gcgcgctgac cctcggggct gccgattcgc tgggggcttg gagagcctcc tgcgcccctc 1860 
ctcgcgcggg ccgagggtcc accttggtcc ccaggccgcg gcgtctccgc tgggtccgcg 1920 
gccgcccgcc tgcccgcgct gccgccgccg ggtcctggag ccagcgagga gcggggccgg 1980 
cgctgcgctt gcccggggcg cgccctccag gatgccgatc cgcccggtcc gctgaaagcg 2040 
cgcgcccctg ctcggcccga gcgacgacga ccgcgcaccc tcgccccgga ggctgccagg 2100 
agaccggggc cgcccctccc gctcccctcc tctccccctc tggctctctc gcgctctctc 2160 
gctctcaggg cccccctcgc tcccccggcc gcagtccgtg cgcgagggcg ccggcgagcc 2220 

gtctcggaag cagc atg cag gcg cgc tac tcc gtg tcc gac ccc aac gcc 2270 
               Met Gln Ala Arg Tyr Ser Val Ser Asp Pro Asn Ala 
                 1               5                  10 



ctg gga gtg gtg ccc tac ctg agc gag cag aat tac tac cgg gct gcg 2318 
Leu Gly Val Val Pro Tyr Leu Ser Glu Gln Asn Tyr Tyr Arg Ala Ala 
         15                  20                  25 

ggc agc tac ggc ggc atg gcc agc ccc atg ggc gtc tat tcc ggc cac 2366 
Gly Ser Tyr Gly Gly Met Ala Ser Pro Met Gly Val Tyr Ser Gly His 
     30                  35                  40 

ccg gag cag tac agc gcg ggg atg ggc cgc tcc tac gcg ccc tac cac 2414 
Pro Glu Gln Tyr Ser Ala Gly Met Gly Arg Ser Tyr Ala Pro Tyr His 
 45                  50                  55                  60 

cac cac cag ccc gcg gcg cct aag gac ctg gtg aag ccg ccc tac agc 2462 
His His Gln Pro Ala Ala Pro Lys Asp Leu Val Lys Pro Pro Tyr Ser 
                 65                  70                  75 

tac atc gcg ctc atc acc atg gcc atc cag aac gcg ccc gag aag aag 2510 
Tyr Ile Ala Leu Ile Thr Met Ala Ile Gln Asn Ala Pro Glu Lys Lys 
             80                  85                  90 

atc acc ttg aac ggc atc tac cag ttc atc atg gac cgc ttc ccc ttc 2558 
Ile Thr Leu Asn Gly Ile Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe 
         95                 100                 105 

tac cgg gag aac aag cag ggc tgg cag aac agc atc cgc cac aac ctc 2606 
Tyr Arg Glu Asn Lys Gln Gly Trp Gln Asn Ser Ile Arg His Asn Leu 
    110                 115                 120 

tcg ctc aac gag tgc ttc gtc aag gtg ccc cgc gac gac aag aag ccc 2654 
Ser Leu Asn Glu Cys Phe Val Lys Val Pro Arg Asp Asp Lys Lys Pro 
125                 130                 135                 140 

ggc aag ggc agt tac tgg acc ctg gac ccg gac tcc tac aac atg ttc 2702 
Gly Lys Gly Ser Tyr Trp Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe 
                145                 150                 155 

gag aac ggc agc ttc ctg cgg cgc cgg cgg cgc ttc aaa aag aag gac 2750 
Glu Asn Gly Ser Phe Leu Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp 
            160                 165                 170 

gtg tcc aag gag aag gag gag cgg gcc cac ctc aag gag ccg ccc ccg 2798 
Val Ser Lys Glu Lys Glu Glu Arg Ala His Leu Lys Glu Pro Pro Pro 
        175                 180                 185 

gcg gcg tcc aag ggc gcc ccg gcc acc ccc cac cta gcg gac gcc ccc 2846 
Ala Ala Ser Lys Gly Ala Pro Ala Thr Pro His Leu Ala Asp Ala Pro 
    190                 195                 200 

aag gag gcc gag aag aag gtg gtg atc aag agc gag gcg gcg tcc ccg 2894 
Lys Glu Ala Glu Lys Lys Val Val Ile Lys Ser Glu Ala Ala Ser Pro 
205                 210                 215                 220 

gcg ctg ccg gtc atc acc aag gtg gag acg ctg agc ccc gag agc gcg 2942 
Ala Leu Pro Val Ile Thr Lys Val Glu Thr Leu Ser Pro Glu Ser Ala 
                225                 230                 235 

ctg cag ggc agc ccg cgc agc gcg gcc tcc acg ccc gcc ggc tcc ccc 2990 
Leu Gln Gly Ser Pro Arg Ser Ala Ala Ser Thr Pro Ala Gly Ser Pro 
            240                 245                 250 

gac ggt tcg ctg ccg gag cac cac gcc gcg gcg ccc aac ggg ctg cct 3038 
Asp Gly Ser Leu Pro Glu His His Ala Ala Ala Pro Asn Gly Leu Pro 
        255                 260                 265 

ggc ttc agc gtg gag aac atc atg acc ctg cga acg tcg ccg ccg ggc 3086 
Gly Phe Ser Val Glu Asn Ile Met Thr Leu Arg Thr Ser Pro Pro Gly 
    270                 275                 280 

gga gag ctg agc ccg ggg gcc gga cgc gcg ggc ctg gtg gtg ccg ccg 3134 
Gly Glu Leu Ser Pro Gly Ala Gly Arg Ala Gly Leu Val Val Pro Pro 
285                 290                 295                 300 

ctg gcg ctg cca tac gcc gcc gcg ccg ccc gcc gcc tac ggc cag ccg 3182 
Leu Ala Leu Pro Tyr Ala Ala Ala Pro Pro Ala Ala Tyr Gly Gln Pro 
                305                 310                 315 

tgc gct cag ggc ctg gag gcc ggg gcc gcc ggg ggc tac cag tgc agc 3230 
Cys Ala Gln Gly Leu Glu Ala Gly Ala Ala Gly Gly Tyr Gln Cys Ser 
            320                 325                 330 

atg cga gcg atg agc ctg tac acc ggg gcc gag cgg ccg gcg cac atg 3278 
Met Arg Ala Met Ser Leu Tyr Thr Gly Ala Glu Arg Pro Ala His Met 
        335                 340                 345 

tgc gtc ccg ccc gcc ctg gac gag gcc ctc tcg gac cac ccg agc ggc 3326 
Cys Val Pro Pro Ala Leu Asp Glu Ala Leu Ser Asp His Pro Ser Gly 
    350                 355                 360 

ccc acg tcg ccc ctg agc gct ctc aac ctc gcc gcc ggc cag gag ggc 3374 
Pro Thr Ser Pro Leu Ser Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly 
365                 370                 375                 380 

gcg ctc gcc gcc acg ggc cac cac cac cag cac cac ggc cac cac cac 3422 
Ala Leu Ala Ala Thr Gly His His His Gln His His Gly His His His 
                385                 390                 395 

ccg cag gcg ccg ccg ccc ccg ccg gct ccc cag ccc cag ccg acg ccg 3470 
Pro Gln Ala Pro Pro Pro Pro Pro Ala Pro Gln Pro Gln Pro Thr Pro 
            400                 405                 410 

cag ccc ggg gcc gcc gcg gcg cag gcg gcc tcc tgg tat ctc aac cac 3518 
Gln Pro Gly Ala Ala Ala Ala Gln Ala Ala Ser Trp Tyr Leu Asn His 
        415                 420                 425 

agc ggg gac ctg aac cac ctc ccc ggc cac acg ttc gcg gcc cag cag 3566 
Ser Gly Asp Leu Asn His Leu Pro Gly His Thr Phe Ala Ala Gln Gln 
    430                 435                 440 

caa act ttc ccc aac gtg cgg gag atg ttc aac tcc cac cgg ctg ggg 3614 
Gln Thr Phe Pro Asn Val Arg Glu Met Phe Asn Ser His Arg Leu Gly 
445                 450                 455                 460 

att gag aac tcg acc ctc ggg gag tcc cag gtg agt ggc aat gcc agc 3662 
Ile Glu Asn Ser Thr Leu Gly Glu Ser Gln Val Ser Gly Asn Ala Ser 
                465                 470                 475 

tgc cag ctg ccc tac aga tcc acg ccg cct ctc tat cgc cac gca gcc 3710 
Cys Gln Leu Pro Tyr Arg Ser Thr Pro Pro Leu Tyr Arg His Ala Ala 
            480                 485                 490 

ccc tac tcc tac gac tgc acg aaa tac tga cgtgtcccgg gacctcccct 3760 
Pro Tyr Ser Tyr Asp Cys Thr Lys Tyr 
        495                 500 

ccccggcccg ctccggcttc gcttcccagc cccgacccaa ccagacaatt aaggggctgc 3820 
agagacgcaa aaaagaaaca aaacatgtcc accaaccttt tctcagaccc gggagcagag 3880 
agcgggcacg ctagccccca gccgtctgtg aagagcgcag gtaactttaa ttcgccgccc 3940 
cgtttctggg atcccaggaa acccctccaa agggacgcag cccaacaaaa tgagtattgg 4000 
tcttaaaatc cccctcccct accaggacgg ctgtgctgtg ctcgacctga gctttcaaaa 4060 
gttaagttat ggacccaaat cccatagcga gcccctagtg actttctgta ggggtcccca 4120 
taggtgtatg ggggtctcta tagataatat atgtgctgtg tgtaatttta aatttctcca 4180 
accgtgctgt acaaatgtgt ggatttgtaa tcaggctatt ttgttgttgt tgttgttgtt 4240 
cagagccatt aatataatat ttaaagttga gttcactgga taagtttttc atcttgccca 4300 
accatttcta actgccaaat tgaattcaag aaaccgatgt gggttttgtt tcctgtacaa 4360 
ttatgagata taattctttt tcccattgta ggtcttttac aaaacaagaa aataatttat 4420 
ttttttgttg gtggataaag aagtcaagta tctgatactt tttatttaca aagtgtgatg 4480 
gttttgtata gtaggttcca ccctgagtat tcctaaaaga aaaaaaaaaa aaaagcttaa 4540 
aaactctaac ttcatctgtg tttgtcttac gtggtcttaa tcgttgtact taccttaaaa 4600 
taaacccatg ttgttttttc tgcccaaagt ttggacagtg tgtttgtgtt gttgcatttt 4660 
ttacaaacga ggtgtgtttg caaacccacc tgctttgatt atttttgtta cacaggtggg 4720 
tatatgtgta gacacataaa aacgaccaga gaataggagc acacacctgc tgtcttgttt 4780 
agtgacagaa aaaggctttt gattaatttt aaaatcccac tctaggattt tttcttttcg 4840 
agaaaccgcc cagttggagg gggctgcctg aaggaccgga ccatgagttt gccgtgatgc 4900 
attttcttaa atgcacaaaa acatgctaat tgtcaaaaca aacagtgcca ctccatctca 4960 
gtgtccagcc gtccccagtt taggaggtga aggaagggaa gaataaacat ttcccgtttg 5020 
ctaactgcaa cccagggtga gtcctgcttt cccccgattt tataaaattt gagcctcttt 5080 
gcctgcttta atagttttcc agagaatttg aactgggcca atgaaggtct gaaggggacg 5140 
gattttctag cgtttgatat ccatccccct tagcggccag atcagagggg aatttcagac 5200 
tttattactt ctcaatgtca tgtctaaatc tacaccctca tcgcagtgaa aaattttaaa 5260 
acctcattac ccttcaaaaa taatttatga tatttttaga gttctaaatt caagtttttc 5320 
aatatgttaa ataatagaga ttattttttg ttttcaatgt taatatctcg tcttttacat 5380 
ttttaatagt aacatagttt ttgtgaaatg tagctgacga aatggcttta ttatctattt 5440 
caatggctga agtccaccac tcccctgctg gcctctatgt gtgaatttgg ggaccaaagc 5500 
ttcatcaatt cccaccccag caggtgagct gtaccttgct aatgctgaag ttctttgtga 5560 
gcttaacgtt tcaagaccag atgattttgc taaaggtgat tttgcttgat gcagtggcgc 5620 
tgaacgtaac ccgggtgttt ttgtcgtgtt gttttcaaca tggcacttta tctccacgct 5680 
atgttgaaat agaattaggg gaagcttaaa gcataataat tgtccccaca tgtgcaacac 5740 
agactctttc aatctgtggc cccagaggtg gcacacagtt aagacttggc ggctgtctca 5800 
ttctttttca taatgtgcgg gttcccgggt gtccgggtgc tagactttca gcaggcccca 5860 
ggccagacgg gctttggttg agtgaacagg aggaggaagt taaggaggta ggggtgggga 5920 
gagaccctct ccaagctgca gaagaaggtg gcccaagctc cttgcctgcg tctgccgtga 5980 
tggtttcatt ttacttctgc tcgcttcatg ctatttgccc caggagaaga ggagagtatt 6040 
ccagacggta agcgagctgg ctttttccct tccctagacg ttttaaagaa atctttctga 6100 
aagcttgccc tcatcgtaag ctttgaaacc gttggtgtcc tgttagtggc gagggctgag 6160 
agacacgcgg agaaataaag gagagcgacg gtgtggctga gagcccccag gtctgctgtt 6220 
gaaactaagc tggqcttttq cacctttagg aagccttttt aaagaagtcc tgctgtgtgg 6280 
gggccggaag cccaagtgag tgggccttgt ggaggttatc gggaggggtc tttaccactc 6340 
cttggggaac gtgggcaacg gggggattgt atctgaagct ttattcaggt cttcggcggc 6400 
agcagagtgg agaaccaggc ccttagtgtg tagcggcctg gggattttgg gactcatc 6458 



<210> 2 
<211> 501 
<212> PRT 

<213> Homo sapiens 



<400> 2 





Met Gln Ala Arg Tyr Ser Val Ser Asp Pro Asn Ala Leu Gly Val Val 
  1               5                  10                  15 

Pro Tyr Leu Ser Glu Gln Asn Tyr Tyr Arg Ala Ala Gly Ser Tyr Gly 
             20                  25                  30 

Gly Met Ala Ser Pro Met Gly Val Tyr Ser Gly His Pro Glu Gln Tyr 
         35                  40                  45 

Ser Ala Gly Met Gly Arg Ser Tyr Ala Pro Tyr His His His Gln Pro 
     50                  55                  60 

Ala Ala Pro Lys Asp Leu Val Lys Pro Pro Tyr Ser Tyr Ile Ala Leu 
 65                  70                  75                  80 

Ile Thr Met Ala Ile Gln Asn Ala Pro Glu Lys Lys Ile Thr Leu Asn 
                 85                  90                  95 

Gly Ile Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn 
            100                 105                 110 

Lys Gln Gly Trp Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu 
        115                 120                 125 

Cys Phe Val Lys Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser 
    130                 135                 140 

Tyr Trp Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser 
145                 150                 155                 160 

Phe Leu Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp Val Ser Lys Glu 
                165                 170                 175 

Lys Glu Glu Arg Ala His Leu Lys Glu Pro Pro Pro Ala Ala Ser Lys 
            180                 185                 190 

Gly Ala Pro Ala Thr Pro His Leu Ala Asp Ala Pro Lys Glu Ala Glu 
        195                 200                 205 

Lys Lys Val Val Ile Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val 
    210                 215                 220 

Ile Thr Lys Val Glu Thr Leu Ser Pro Glu Ser Ala Leu Gln Gly Ser 
225                 230                 235                 240 

Pro Arg Ser Ala Ala Ser Thr Pro Ala Gly Ser Pro Asp Gly Ser Leu 
                245                 250                 255 

Pro Glu His His Ala Ala Ala Pro Asn Gly Leu Pro Gly Phe Ser Val 
            260                 265                 270 

Glu Asn Ile Met Thr Leu Arg Thr Ser Pro Pro Gly Gly Glu Leu Ser 
        275                 280                 285 

Pro Gly Ala Gly Arg Ala Gly Leu Val Val Pro Pro Leu Ala Leu Pro 
    290                 295                 300 

Tyr Ala Ala Ala Pro Pro Ala Ala Tyr Gly Gln Pro Cys Ala Gln Gly 
305                 310                 315                 320 

Leu Glu Ala Gly Ala Ala Gly Gly Tyr Gln Cys Ser Met Arg Ala Met 
                325                 330                 335 

Ser Leu Tyr Thr Gly Ala Glu Arg Pro Ala His Met Cys Val Pro Pro 
            340                 345                 350 

Ala Leu Asp Glu Ala Leu Ser Asp His Pro Ser Gly Pro Thr Ser Pro 
        355                 360                 365 

Leu Ser Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Ala Ala 
    370                 375                 380 










Thr 


Gly 


His 


His 


His 


Gin 


His 


His 


Gly 


His 


His 


His 


Pro 


Gin 


Ala 


Pro 


385 










390 










395 










400 


Pro 


Pro 


Pro 


Pro 


Ala 


Pro 


Gin 


Pro 


Gin 


Pro 


Thr 


Pro 


Gin 


Pro 


Gly 


Ala 










405 










410 










415 




Ala 


Ala 


Ala 


Gin 


Ala 


Ala 


Ser 


Trp 


Tyr 


Leu 


Asn 


His 


Ser 


Gly Asp 


Leu 








420 










425 










430 






Asn 


His 


Leu 


Pro 


Gly 


His 


Thr 


Phe 


Ala 


Ala 


Gin 


Gin 


Gin 


Thr 


Phe 


Pro 






435 










440 










445 








Asn 


Val 


Arg 


Glu 


Met 


Phe 


Asn 


Ser 


His 


Arg 


Leu 


Gly 


lie 


Glu 


Asn 


Ser 




450 










455 










460 










Thr 


Leu 


Gly 


Glu 


Ser 


Gin 


Val 


Ser 


Gly 


Asn 


Ala 


Ser 


Cys 


Gin 


Leu 


Pro 


465 










470 










475 










480 


Tyr 


Arg 


Ser 


Thr 


Pro 


Pro 


Leu 


Tyr 


Arg 


His 


Ala 


Ala 


Pro 


Tyr 


Ser 


Tyr 










485 










490 










495 




Asp 


Cys 


Thr 


Lys 


Tyr 

























500 



<210> 3 

<211> 4158 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (187)..(1437)
<400> 3 

cctttggctt tgaattgatc aggagacaaa gataatgcat ctacattttc gtcttctgtt 60 
cttttattgg aaataagtgg cacgccccat tgccttctag tcgcctcccc gaagcgaaga 120 
ggccgaagcg aagaggcctg gtgggttgtc tcaacatcct tttgctgaga atcgaatacg 180 



cagccg atg aac agc cag gaa ggg tgc aag gaa acc ttg aac ggc atc 228
       Met Asn Ser Gln Glu Gly Cys Lys Glu Thr Leu Asn Gly Ile
       1               5                   10

tac cag ttc atc atg gac cgc ttc ccc ttc tac cgg gag aac aag cag 276
Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn Lys Gln
15                  20                  25                  30

ggc tgg cag aac agc atc cgc cac aac ctc tcg ctc aac gag tgc ttc 324
Gly Trp Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu Cys Phe
                35                  40                  45

gtc aag gtg ccc cgc gac gac aag aag ccc ggc aag ggc agt tac tgg 372
Val Lys Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser Tyr Trp
            50                  55                  60

acc ctg gac ccg gac tcc tac aac atg ttc gag aac ggc agc ttc ctg 420
Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser Phe Leu
        65                  70                  75

cgg cgc cgg cgg cgc ttc aaa aag aag gac gtg tcc aag gag aag gag 468
Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp Val Ser Lys Glu Lys Glu
    80                  85                  90

gag cgg gcc cac ctc aag gag ccg ccc ccg gcg gcg tcc aag ggc gcc 516
Glu Arg Ala His Leu Lys Glu Pro Pro Pro Ala Ala Ser Lys Gly Ala
95                  100                 105                 110

ccg gcc acc ccc cac cta gcg gac gcc ccc aag gag gcc gag aag aag 564
Pro Ala Thr Pro His Leu Ala Asp Ala Pro Lys Glu Ala Glu Lys Lys
                115                 120                 125

gtg gtg atc aag agc gag gcg gcg tcc ccg gcg ctg ccg gtc atc acc 612
Val Val Ile Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val Ile Thr
            130                 135                 140

aag gtg gag acg ctg agc ccc gag agc gcg ctg cag ggc agc ccg cgc 660
Lys Val Glu Thr Leu Ser Pro Glu Ser Ala Leu Gln Gly Ser Pro Arg
        145                 150                 155

agc gcg gcc tcc acg ccc gcc ggc tcc ccc gac ggt tcg ctg ccg gag 708
Ser Ala Ala Ser Thr Pro Ala Gly Ser Pro Asp Gly Ser Leu Pro Glu
    160                 165                 170

cac cac gcc gcg gcg ccc aac ggg ctg cct ggc ttc agc gtg gag aac 756
His His Ala Ala Ala Pro Asn Gly Leu Pro Gly Phe Ser Val Glu Asn
175                 180                 185                 190

atc atg acc ctg cga acg tcg ccg ccg ggc gga gag ctg agc ccg ggg 804
Ile Met Thr Leu Arg Thr Ser Pro Pro Gly Gly Glu Leu Ser Pro Gly
                195                 200                 205

gcc gga cgc gcg ggc ctg gtg gtg ccg ccg ctg gcg ctg cca tac gcc 852
Ala Gly Arg Ala Gly Leu Val Val Pro Pro Leu Ala Leu Pro Tyr Ala
            210                 215                 220

gcc gcg ccg ccc gcc gcc tac ggc cag ccg tgc gct cag ggc ctg gag 900
Ala Ala Pro Pro Ala Ala Tyr Gly Gln Pro Cys Ala Gln Gly Leu Glu
        225                 230                 235

gcc ggg gcc gcc ggg ggc tac cag tgc agc atg cga gcg atg agc ctg 948
Ala Gly Ala Ala Gly Gly Tyr Gln Cys Ser Met Arg Ala Met Ser Leu
    240                 245                 250

tac acc ggg gcc gag cgg ccg gcg cac atg tgc gtc ccg ccc gcc ctg 996
Tyr Thr Gly Ala Glu Arg Pro Ala His Met Cys Val Pro Pro Ala Leu
255                 260                 265                 270

gac gag gcc ctc tcg gac cac ccg agc ggc ccc acg tcg ccc ctg agc 1044
Asp Glu Ala Leu Ser Asp His Pro Ser Gly Pro Thr Ser Pro Leu Ser
                275                 280                 285

gct ctc aac ctc gcc gcc ggc cag gag ggc gcg ctc gcc gcc acg ggc 1092
Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Ala Ala Thr Gly
            290                 295                 300

cac cac cac cag cac cac ggc cac cac cac ccg cag gcg ccg ccg ccc 1140
His His His Gln His His Gly His His His Pro Gln Ala Pro Pro Pro
        305                 310                 315

ccg ccg gct ccc cag ccc cag ccg acg ccg cag ccc ggg gcc gcc gcg 1188
Pro Pro Ala Pro Gln Pro Gln Pro Thr Pro Gln Pro Gly Ala Ala Ala
    320                 325                 330

gcg cag gcg gcc tcc tgg tat ctc aac cac agc ggg gac ctg aac cac 1236
Ala Gln Ala Ala Ser Trp Tyr Leu Asn His Ser Gly Asp Leu Asn His
335                 340                 345                 350

ctc ccc ggc cac acg ttc gcg gcc cag cag caa act ttc ccc aac gtg 1284
Leu Pro Gly His Thr Phe Ala Ala Gln Gln Gln Thr Phe Pro Asn Val
                355                 360                 365

cgg gag atg ttc aac tcc cac cgg ctg ggg att gag aac tcg acc ctc 1332
Arg Glu Met Phe Asn Ser His Arg Leu Gly Ile Glu Asn Ser Thr Leu
            370                 375                 380

ggg gag tcc cag gtg agt ggc aat gcc agc tgc cag ctg ccc tac aga 1380
Gly Glu Ser Gln Val Ser Gly Asn Ala Ser Cys Gln Leu Pro Tyr Arg
        385                 390                 395

tcc acg ccg cct ctc tat cgc cac gca gcc ccc tac tcc tac gac tgc 1428
Ser Thr Pro Pro Leu Tyr Arg His Ala Ala Pro Tyr Ser Tyr Asp Cys
    400                 405                 410

acg aaa tac tgacgtgtcc cgggacctcc cctccccggc ccgctccggc 1477
Thr Lys Tyr
415





ttcgcttccc agccccgacc caaccagaca attaaggggc tgcagagacg caaaaaagaa 1537
acaaaacatg tccaccaacc ttttctcaga cccgggagca gagagcgggc acgctagccc 1597
ccagccgtct gtgaagagcg caggtaactt taattcgccg ccccgtttct gggatcccag 1657
gaaacccctc caaagggacg cagcccaaca aaatgagtat tggtcttaaa atccccctcc 1717
cctaccagga cggctgtgct gtgctcgacc tgagctttca aaagttaagt tatggaccca 1777
aatcccatag cgagccccta gtgactttct gtaggggtcc ccataggtgt atgggggtct 1837
ctatagataa tatatgtgct gtgtgtaatt ttaaatttct ccaaccgtgc tgtacaaatg 1897
tgtggatttg taatcaggct attttgttgt tgttgttgtt gttcagagcc attaatataa 1957
tatttaaagt tgagttcact ggataagttt ttcatcttgc ccaaccattt ctaactgcca 2017
aattgaattc aagaaaccga tgtgggtttt gtttcctgta caattatgag atataattct 2077
ttttcccatt gtaggtcttt tacaaaacaa gaaaataatt tatttttttg ttggtggata 2137
aagaagtcaa gtatctgata ctttttattt acaaagtgtg atggttttgt atagtaggtt 2197
ccaccctgag tattcctaaa agaaaaaaaa aaaaaaagct taaaaactct aacttcatct 2257
gtgtttgtct tacgtggtct taatcgttgt acttacctta aaataaaccc atgttgtttt 2317
ttctgcccaa agtttggaca gtgtgtttgt gttgttgcat tttttacaaa cgaggtgtgt 2377
ttgcaaaccc acctgctttg attatttttg ttacacaggt gggtatatgt gtagacacat 2437
aaaaacgacc agagaatagg agcacacacc tgctgtcttg tttagtgaca gaaaaaggct 2497
tttgattaat tttaaaatcc cactctagga ttttttcttt tcgagaaacc gcccagttgg 2557
agggggctgc ctgaaggacc ggaccatgag tttgccgtga tgcattttct taaatgcaca 2617
aaaacatgct aattgtcaaa acaaacagtg ccactccatc tcagtgtcca gccgtcccca 2677
gtttaggagg tgaaggaagg gaagaataaa catttcccgt ttgctaactg caacccaggg 2737
tgagtcctgc tttcccccga ttttataaaa tttgagcctc tttgcctgct ttaatagttt 2797
tccagagaat ttgaactggg ccaatgaagg tctgaagggg acggattttc tagcgtttga 2857
tatccatccc ccttagcggc cagatcagag gggaatttca gactttatta cttctcaatg 2917
tcatgtctaa atctacaccc tcatcgcagt gaaaaatttt aaaacctcat tacccttcaa 2977
aaataattta tgatattttt agagttctaa attcaagttt ttcaatatgt taaataatag 3037
agattatttt ttgttttcaa tgttaatatc tcgtctttta catttttaat agtaacatag 3097
tttttgtgaa atgtagctga cgaaatggct ttattatcta tttcaatggc tgaagtccac 3157
cactcccctg ctggcctcta tgtgtgaatt tggggaccaa agcttcatca attcccaccc 3217
cagcaggtga gctgtacctt gctaatgctg aagttctttg tgagcttaac gtttcaagac 3277
cagatgattt tgctaaaggt gattttgctt gatgcagtgg cgctgaacgt aacccgggtg 3337
tttttgtcgt gttgttttca acatggcact ttatctccac gctatgttga aatagaatta 3397
ggggaagctt aaagcataat aattgtcccc acatgtgcaa cacagactct ttcaatctgt 3457
ggccccagag gtggcacaca gttaagactt ggcggctgtc tcattctttt tcataatgtg 3517
cgggttcccg ggtgtccggg tgctagactt tcagcaggcc ccaggccaga cgggctttgg 3577
ttgagtgaac aggaggagga agttaaggag gtaggggtgg ggagagaccc tctccaagct 3637
gcagaagaag gtggcccaag ctccttgcct gcgtctgccg tgatggtttc attttacttc 3697



tgctcgcttc atgctatttg ccccaggaga agaggagagt attccagacg gtaagcgagc 3757 

tggctttttc ccttccctag acgttttaaa gaaatctttc tgaaagcttg ccctcatcgt 3817 

aagctttgaa accgttggtg tcctgttagt ggcgagggct gagagacacg cggagaaata 3877 

aaggagagcg acggtgtggc tgagagcccc caggtctgct gttgaaacta agctgggctt 3937 

ttgcaccttt aggaagcctt tttaaagaag tcctgctgtg tgggggccgg aagcccaagt 3997 

gagtgggcct tgtggaggtt atcgggaggg gtctttacca ctccttgggg aacgtgggca 4057 

acggggggat tgtatctgaa gctttattca ggtcttcggc ggcagcagag tggagaacca 4117 

ggcccttagt gtgtagcggc ctggggattt tgggactcat c 4158 

<210> 4 
<211> 417 
<212> PRT 

<213> Homo sapiens 
<400> 4 

Met Asn Ser Gln Glu Gly Cys Lys Glu Thr Leu Asn Gly Ile Tyr Gln
1               5                   10                  15

Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn Lys Gln Gly Trp
            20                  25                  30

Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu Cys Phe Val Lys
        35                  40                  45

Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser Tyr Trp Thr Leu
    50                  55                  60

Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser Phe Leu Arg Arg
65                  70                  75                  80

Arg Arg Arg Phe Lys Lys Lys Asp Val Ser Lys Glu Lys Glu Glu Arg
                85                  90                  95

Ala His Leu Lys Glu Pro Pro Pro Ala Ala Ser Lys Gly Ala Pro Ala
            100                 105                 110

Thr Pro His Leu Ala Asp Ala Pro Lys Glu Ala Glu Lys Lys Val Val
        115                 120                 125

Ile Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val Ile Thr Lys Val
    130                 135                 140

Glu Thr Leu Ser Pro Glu Ser Ala Leu Gln Gly Ser Pro Arg Ser Ala
145                 150                 155                 160

Ala Ser Thr Pro Ala Gly Ser Pro Asp Gly Ser Leu Pro Glu His His
                165                 170                 175

Ala Ala Ala Pro Asn Gly Leu Pro Gly Phe Ser Val Glu Asn Ile Met
            180                 185                 190

Thr Leu Arg Thr Ser Pro Pro Gly Gly Glu Leu Ser Pro Gly Ala Gly
        195                 200                 205

Arg Ala Gly Leu Val Val Pro Pro Leu Ala Leu Pro Tyr Ala Ala Ala
    210                 215                 220

Pro Pro Ala Ala Tyr Gly Gln Pro Cys Ala Gln Gly Leu Glu Ala Gly
225                 230                 235                 240

Ala Ala Gly Gly Tyr Gln Cys Ser Met Arg Ala Met Ser Leu Tyr Thr
                245                 250                 255

Gly Ala Glu Arg Pro Ala His Met Cys Val Pro Pro Ala Leu Asp Glu
            260                 265                 270

Ala Leu Ser Asp His Pro Ser Gly Pro Thr Ser Pro Leu Ser Ala Leu
        275                 280                 285

Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Ala Ala Thr Gly His His
    290                 295                 300

His Gln His His Gly His His His Pro Gln Ala Pro Pro Pro Pro Pro
305                 310                 315                 320

Ala Pro Gln Pro Gln Pro Thr Pro Gln Pro Gly Ala Ala Ala Ala Gln
                325                 330                 335

Ala Ala Ser Trp Tyr Leu Asn His Ser Gly Asp Leu Asn His Leu Pro
            340                 345                 350

Gly His Thr Phe Ala Ala Gln Gln Gln Thr Phe Pro Asn Val Arg Glu
        355                 360                 365

Met Phe Asn Ser His Arg Leu Gly Ile Glu Asn Ser Thr Leu Gly Glu
    370                 375                 380

Ser Gln Val Ser Gly Asn Ala Ser Cys Gln Leu Pro Tyr Arg Ser Thr
385                 390                 395                 400

Pro Pro Leu Tyr Arg His Ala Ala Pro Tyr Ser Tyr Asp Cys Thr Lys
                405                 410                 415

Tyr
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<211> 6021 
<212> DNA 
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<220> 

<221> exon 
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<300> 
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<309> 1997-05-14 
<300> 

<301> Miura, N 
<303> Genomics 
<304> 41 
<306> 489-492 
<307> 1997 



<400> 5 

ctcgagtcaa aggtagcaca cataaaacct attttgctgc ttcggtacgt caagcaatgc 60 
cactaaagtt tcctcacccg ccaaagctga aacagtgagt tctaatctct caaagccttt 120 
tgccgaaaat ctaaaggggg tggggggcta tggtggtggc gtgggggggg ggtcggagaa 180 
gaagaaagac tgagacaaat gttttatctg tcgccttctt ccctacccaa ccggaccaac 240 
aacttccaga aggttctgcg aggcatagag ccattccgta gggacatctc ggtgcttctg 300 
aggaagcgga ccgagcaggg atccgatgac gactggagat gttgaaggaa taaataccag 360 
tccacaaata aacaaactgt ccccgggatt cctagaggga aggagcacgc ttgaaggtcg 420 
gggaactccg agtcgctgtg cgtcaaggtt ggcataaaat taaaaaaaaa aaaagtcctt 480 
cagttaccag gccctctaag gagcccctgg tcctcagctc accttatcaa aactcagtaa 540 
aacaaacagc ctgaaataca gtcaatttac aggatcccaa agatgctgac cgcggagtgg 600 
gacccacgcc gggccccggc aacagctagg gaagcgggtc cgaggctaca cagtgccgcg 660 
ctcctttgcg tttccagtga cgaagccggc gatggagtgc aggcttggag ctccccacgc 720 
cgaacgggga caccagctcc cgggggctgg ctgccttgtc ctaacctcca gacagcgctt 780 
tcataggtgg ggagaaggga gaggccggga tggatggcag ggaaagctag ccctcgtcta 840 
tgcgggagag gagaccagga aagcaacagt tgggttcacg cgcttccctg aaccccacga 900 
aattgtttgg aggactcaga tggatcacct aagtagcagc gaagacgaag gaccaatggt 960 
tccttaggtg ttaccttccc agtttggcat tcccactaag ccttccctcc cagcccgacc 1020 
ccgtcgtgaa ggggagagga accgaattct ccaacccggc ctcctttgtg ggctcttcct 1080 
caacctggaa gcgtcctgtg aattatccat cactgcattc aacaggccct acacgctcag 1140 
tccgtttgct ctgaacccat tacaactagg ccccgataat taagaaatct aattattcgc 1200 
ctcttcatcc attaataata ataaaaaaaa aatctccagg ctctttccta cttacaaggt 1260 
cttgggggca aatctctgcc caacttcatc aattcgatgt tatatttcaa actaaacttc 1320 
tttttatttt ccaaaggaac agggttttta atttttgctc tggacacgtg gtctcgttaa 1380 
acaaaatgtg ataataaaat aaaattttat aagatgtaac tcatttttaa aagtcctcaa 1440 
gttaacttga gctggggggg ggggagatct ggctaagagc atctgggtct tagagccgac 1500 
ggattcaggc gctcctcgtt ttgattggtg ccatccttct cgcagctgcc agatgattgg 1560 
tgcaaacttc ctggaggggg cgcggcctga agaaagtaaa aactcgcttt gagccagaag 1620 
acttttgaaa cttttcccaa tccctaaaag ggactttgct tctttttccg ggctcggccg 1680 
cgcagcctct ccggacccta gctcgctgac gctgcgggct gcagttctcc tggcggggcc 1740
cgagagccgc tgtctccttt tctagcactc ggaagggctg gtgtcgctcc acggtcgcgc 1800 
gtggcgtctg tgccgccagc tcagggctgc cacccgccaa gccgagagtg cgcggccagc 1860 
ggggccgcct gccgtgcacc cttcaggatg ccgatccgcc cggtcggctg aacccgagcg 1920 
ccggcgtctt ccgcgcgtgg accgcgaggc tgccccgagt cggggctgcc tgcatcgctc 1980 
cgtcccttcc tgctctcctg ctccgggcct cgctcgccgc gggccgcagt cggtgcgcgc 2040 
aggcggcgac cgggcgtctg ggacgcagca tgcaggcgcg ttactcggta tcggacccca 2100 
acgccctggg agtggtaccc tatttgagtg agcaaaacta ctaccgggcg gccggcagct 2160 
acggcggcat ggccagcccc atgggcgtct actccggcca cccggagcag tacggcgccg 2220 
gcatgggccg ctcctacgcg ccctaccacc accagcccgc ggcgcccaag gacctggtga 2280 
agccgcccta cagctatata gcgctcatca ccatggcgat ccagaacgcg ccagagaaga 2340 
agatcactct gaacggcatc taccagttca tcatggaccg tttccccttc taccgcgaga 2400 
acaagcaggg ctggcagaac agcatccgcc acaacctgtc actcaatgag tgcttcgtga 2460
aagtgccgcg cgacgacaag aagccgggca agggcagcta ctggacgctc gacccggact 2520 
cctacaacat gttcgagaat ggcagcttcc tgcggcggcg gcggcgcttc aagaagaagg 2580 
atgtgcccaa ggacaaggag gagcgggccc acctcaagga gccgccctcg accacggcca 2640
agggcgctcc gacagggacc ccggtagctg acgggcccaa ggaggccgag aagaaagtcg 2700 
tggttaagag cgaggcggcg tcccccgcgc tgccggtcat caccaaggtg gagacgctga 2760
gccccgaggg agcgctgcag gccagtccgc gcagcgcatc ctccacgccc gcaggttccc 2820 
cagacggctc gctgccggag caccacgccg cggcgcctaa cgggctgccc ggcttcagcg 2880 
tggagaccat catgacgctg cgcacgtcgc ctccgggcgg cgatctgagc ccagcggccg 2940
cgcgcgccgg cctggtggtg ccaccgctgg cactgccata cgccgcagcg ccacccgccg 3000
cttacacgca gccgtgcgcg cagggcctgg aggctgcggg ctccgcgggc taccagtgca 3060
gtatgcgggc tatgagtctg tacaccgggg ccgagcggcc cgcgcacgtg tgcgttccgc 3120
ccgcgctgga cgaggctctg tcggaccacc cgagcggccc cggctccccg ctcggcgccc 3180
tcaacctcgc agcgggtcag gagggcgcgt tgggggcctc gggtcaccac caccagcatc 3240
acggccacct ccacccgcag gcgccaccgc ccgccccgca gccccctccc gcgccgcagc 3300
ccgccaccca ggccacctcc tggtatctga accacggcgg ggacctgagc cacctccccg 3360
gccacacgtt tgcaacccaa cagcaaactt tccccaacgt ccgggagatg ttcaactcgc 3420
accggctagg actggacaac tcgtccctcg gggagtccca ggtgagcaat gcgagctgtc 3480
agctgcccta tcgagctacg ccgtccctct accgccacgc agccccctac tcttacgact 3540
gcaccaaata ctgaggctgt ccagtccgct ccagccccag gaccgcaccg gcttcgcctc 3600
ctccatggga accttcttcg acggagccgc agaaagcgac ggaaagcgcc cctctctcag 3660
aaccaggagc agagagctcc gtgcaactcg caggtaactt atccgcagct cagtttgaga 3720
tctcagcgag tccctctaag ggggatgcag cccagcaaaa cgaaatacag attttttttt 3780
taattccttc ccctacccag atgctgcgcc tgctcccttg gggcttcata gattagctta 3840
tggaccaaac ccatagggac ccctaatgac ttctgtggag attctccacg ggcgcaagag 3900
gtctctccgg ataaggtgcc ttctgtaaac gagtgcggat ttgtaaccag gctattttgt 3960
tcttgcccag agcctttaat ataatattta aagttgtgtc cactggataa ggtttcgtct 4020
tgcccaactg ttactgccaa attgaattca agaaacgtgt gtgggtcttt tctccccacg 4080
tcaccatgat aaaataggtc cctccccaaa ctgtaggtct tttacaaaac aagaaaataa 4140
tttatttttt tgttgttgtt ggataacgaa attaagtatc ggatactttt aatttaggaa 4200
gtgcatggct ttgtacagta gatgccatct ggggtattcc aaaaacacac caaaagactt 4260
taaaatttca atctcacctg tgtttgtctt atgtgatctc agtgttgtat ttaccttaaa 4320
ataaacccgt gttgtttttc tgcccaaagt tcggacagag tctttgtgtt cttgaatttt 4380
aaaagggaaa ttgtagtaag ccagttgtga ttgatttttg tgatgcaggt tggcctggta 4440
acgtggatgc atatacaggt tacaggacga tggagctctc gattagtaat agaaggggct 4500
cttgatttgt tgaactatcc cgtcctgaga tatttttgtt ttctgctcga ggtaatctga 4560
gaaactgttc tccatccaca cacggacagg gctgcctgag ggcaacgtcc tgctggcctg 4620
ttaacgaaat gctttgcggg atgcagaaaa ctgttgccaa ttgtcaaaac aaaatggtgt 4680
caccctgtct cggtgtccag ctgtcctctg ttagagggga gaaaccgaga aaggacaaac 4740
ggcctgcagc ttgctaacct cagcgtagca ggagcctggg tgagtgctcg gctccctcca 4800
tttccttaga tgcggacttg ttgcccctgt tggcgtttta agagtgccag caagaagcaa 4860
agagggttgg taggtctctg gtatttaact gccggctttg ggatcagatt agaagtgaat 4920
ttcagtctga tttatttctt aatttgggct ttaaatattt tactccggcg tggtggaaaa 4980
agaagccact gtgcgcctcc agcatgatat tttagcgctg aaatggctct ggttttcagc 5040
atgctaagta acaggagatt atttttcttt tgattcttgt atttcatttc tttaaaaaaa 5100
aaaaaggaaa tagatcggga caaactctct aaaatgtacc tggctggctg gggtggggtc 5160
cttaccaatc tgctgcctga aagatacagc ttcagcacag gcctgcgtgt tggactttag 5220
gcatatcatg gattcccacg ccagttggta acctggactg tgctaatgga agttctttct 5280
gcacagaaca tgtaggccag gaggaggcag ggacccggga ggggggtgga ctttgcaggt 5340
catctgctta gcttagtggt ggccacgggt taacacgtat atagtgttac tgtttgaaac 5400
tccaagtttt atatctgtgc tgttttgatg tagaatttgg ggaggttcct gatgatacta 5460
ccctacccgt gtatgtaaga cagtctttca acctgcagtg ccagaatgtg acccacactt 5520
cagtatcttc cataaagtgg ggrggactaag aactggacag gggtgctgtg gaggggggca 5580
ggccaggtgt atcttggttc ctgagcagag cagagagctt aggaaggggt cgggagatct 5640
ctggttcctc ccaacactgg tttcattttg catggctctc ttcaaacctc ttgccccagg 5700
agaagcgagc tttgtccaag ccagctggct cgctcctttc ccagatgttt taggggcctc 5760
cctgaaagct tgccctcctc ttaagattca gaactcctga cccagggaaa gataggaggc 5820
tttgtggatg ggagcttttt tttaaagagg accgttctcg ttctcaagta ggtagctaga 5880
gagaagcccc ctggagcagg ccctacttgt gactgtcagg gaacccaggt tgtgttgtag 5940
gcttttccca ggcctcccag agcagcggtg tgaaaaaatg cggtcctggg aaaagttggt 6000
ctggggtgtt gcttcctcga g 6021



<210> 6 
<211> 2712 
<212> DNA 

<213> Mus musculus 



<220> 
<221> CDS 

<222> (422)..(1906)



<300> 

<308> GenBank/Y08222 
<309> 1997-05-14 

<300> 

<301> Miura, N 

<303> Genomics 

<304> 41 

<306> 489-492 

<307> 1997 

<400> 6 

agggactttg cttctttttc cgggctcggc cgcgcagcct ctccggaccc tagctcgctg 60 

acgctgcggg ctgcagttct cctggcgggg cccgagagcc gctgtctcct tttctagcac 120 

tcggaagggc tggtgtcgct ccacggtcgc gcgtggcgtc tgtgccgcca gctcagggct 180 

gccacccgcc aagccgagag tgcgcggcca gcggggccgc ctgccgtgca cccttcagga 240

tgccgatccg cccggtcggc tgaacccgag cgccggcgtc ttccgcgcgt ggaccgcgag 300 

gctgccccga gtcggggctg cctgcatcgc tccgtccctt cctgctctcc tgctccgggc 360 

ctcgctcgcc gcgggccgca gtcggtgcgc gcaggcggcg accgggcgtc tgggacgcag 420 

c atg cag gcg cgt tac tcg gta tcg gac ccc aac gcc ctg gga gtg gta 469
  Met Gln Ala Arg Tyr Ser Val Ser Asp Pro Asn Ala Leu Gly Val Val
  1               5                   10                  15

ccc tat ttg agt gag caa aac tac tac cgg gcg gcc ggc agc tac ggc 517
Pro Tyr Leu Ser Glu Gln Asn Tyr Tyr Arg Ala Ala Gly Ser Tyr Gly
            20                  25                  30

ggc atg gcc agc ccc atg ggc gtc tac tcc ggc cac ccg gag cag tac 565
Gly Met Ala Ser Pro Met Gly Val Tyr Ser Gly His Pro Glu Gln Tyr
        35                  40                  45

ggc gcc ggc atg ggc cgc tcc tac gcg ccc tac cac cac cag ccc gcg 613
Gly Ala Gly Met Gly Arg Ser Tyr Ala Pro Tyr His His Gln Pro Ala
    50                  55                  60

gcg ccc aag gac ctg gtg aag ccg ccc tac agc tat ata gcg ctc atc 661
Ala Pro Lys Asp Leu Val Lys Pro Pro Tyr Ser Tyr Ile Ala Leu Ile
65                  70                  75                  80

acc atg gcg atc cag aac gcg cca gag aag aag atc act ctg aac ggc 709
Thr Met Ala Ile Gln Asn Ala Pro Glu Lys Lys Ile Thr Leu Asn Gly
                85                  90                  95

atc tac cag ttc atc atg gac cgt ttc ccc ttc tac cgc gag aac aag 757
Ile Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn Lys
            100                 105                 110

cag ggc tgg cag aac agc atc cgc cac aac ctg tca ctc aat gag tgc 805
Gln Gly Trp Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu Cys
        115                 120                 125

ttc gtg aaa gtg ccg cgc gac gac aag aag ccg ggc aag ggc agc tac 853
Phe Val Lys Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser Tyr
    130                 135                 140

tgg acg ctc gac ccg gac tcc tac aac atg ttc gag aat ggc agc ttc 901
Trp Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser Phe
145                 150                 155                 160

ctg cgg cgg cgg cgg cgc ttc aag aag aag gat gtg ccc aag gac aag 949
Leu Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp Val Pro Lys Asp Lys
                165                 170                 175

gag gag cgg gcc cac ctc aag gag ccg ccc tcg acc acg gcc aag ggc 997
Glu Glu Arg Ala His Leu Lys Glu Pro Pro Ser Thr Thr Ala Lys Gly
            180                 185                 190

gct ccg aca ggg acc ccg gta gct gac ggg ccc aag gag gcc gag aag 1045
Ala Pro Thr Gly Thr Pro Val Ala Asp Gly Pro Lys Glu Ala Glu Lys
        195                 200                 205

aaa gtc gtg gtt aag agc gag gcg gcg tcc ccc gcg ctg ccg gtc atc 1093
Lys Val Val Val Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val Ile
    210                 215                 220

acc aag gtg gag acg ctg agc ccc gag gga gcg ctg cag gcc agt ccg 1141
Thr Lys Val Glu Thr Leu Ser Pro Glu Gly Ala Leu Gln Ala Ser Pro
225                 230                 235                 240

cgc agc gca tcc tcc acg ccc gca ggt tcc cca gac ggc tcg ctg ccg 1189
Arg Ser Ala Ser Ser Thr Pro Ala Gly Ser Pro Asp Gly Ser Leu Pro
                245                 250                 255

gag cac cac gcc gcg gcg cct aac ggg ctg ccc ggc ttc agc gtg gag 1237
Glu His His Ala Ala Ala Pro Asn Gly Leu Pro Gly Phe Ser Val Glu
            260                 265                 270

acc atc atg acg ctg cgc acg tcg cct ccg ggc ggc gat ctg agc cca 1285
Thr Ile Met Thr Leu Arg Thr Ser Pro Pro Gly Gly Asp Leu Ser Pro
        275                 280                 285

gcg gcc gcg cgc gcc ggc ctg gtg gtg cca ccg ctg gca ctg cca tac 1333
Ala Ala Ala Arg Ala Gly Leu Val Val Pro Pro Leu Ala Leu Pro Tyr
    290                 295                 300

gcc gca gcg cca ccc gcc gct tac acg cag ccg tgc gcg cag ggc ctg 1381
Ala Ala Ala Pro Pro Ala Ala Tyr Thr Gln Pro Cys Ala Gln Gly Leu
305                 310                 315                 320

gag gct gcg ggc tcc gcg ggc tac cag tgc agt atg cgg gct atg agt 1429
Glu Ala Ala Gly Ser Ala Gly Tyr Gln Cys Ser Met Arg Ala Met Ser
                325                 330                 335

ctg tac acc ggg gcc gag cgg ccc gcg cac gtg tgc gtt ccg ccc gcg 1477
Leu Tyr Thr Gly Ala Glu Arg Pro Ala His Val Cys Val Pro Pro Ala
            340                 345                 350

ctg gac gag gct ctg tcg gac cac ccg agc ggc ccc ggc tcc ccg ctc 1525
Leu Asp Glu Ala Leu Ser Asp His Pro Ser Gly Pro Gly Ser Pro Leu
        355                 360                 365

ggc gcc ctc aac ctc gca gcg ggt cag gag ggc gcg ttg ggg gcc tcg 1573
Gly Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Gly Ala Ser
    370                 375                 380

ggt cac cac cac cag cat cac ggc cac ctc cac ccg cag gcg cca ccg 1621
Gly His His His Gln His His Gly His Leu His Pro Gln Ala Pro Pro
385                 390                 395                 400

ccc gcc ccg cag ccc cct ccc gcg ccg cag ccc gcc acc cag gcc acc 1669
Pro Ala Pro Gln Pro Pro Pro Ala Pro Gln Pro Ala Thr Gln Ala Thr
                405                 410                 415

tcc tgg tat ctg aac cac ggc ggg gac ctg agc cac ctc ccc ggc cac 1717
Ser Trp Tyr Leu Asn His Gly Gly Asp Leu Ser His Leu Pro Gly His
            420                 425                 430

acg ttt gca acc caa cag caa act ttc ccc aac gtc cgg gag atg ttc 1765
Thr Phe Ala Thr Gln Gln Gln Thr Phe Pro Asn Val Arg Glu Met Phe
        435                 440                 445

aac tcg cac cgg cta gga ctg gac aac tcg tcc ctc ggg gag tcc cag 1813
Asn Ser His Arg Leu Gly Leu Asp Asn Ser Ser Leu Gly Glu Ser Gln
    450                 455                 460

gtg agc aat gcg agc tgt cag ctg ccc tat cga gct acg ccg tcc ctc 1861
Val Ser Asn Ala Ser Cys Gln Leu Pro Tyr Arg Ala Thr Pro Ser Leu
465                 470                 475                 480

tac cgc cac gca gcc ccc tac tct tac gac tgc acc aaa tac tga 1906
Tyr Arg His Ala Ala Pro Tyr Ser Tyr Asp Cys Thr Lys Tyr
                485                 490                 495

ggctgtccag tccgctccag ccccaggacc gcaccggctt cgcctcctcc atgggaacct 1966 

tcttcgacgg agccgcagaa agcgacggaa agcgcccctc tctcagaacc aggagcagag 2026

agctccgtgc aactcgcagg taacttatcc gcagctcagt ttgagatctc agcgagtccc 2086

tctaaggggg atgcagccca gcaaaacgaa atacagattt tttttttaat tccttcccct 2146

acccagatgc tgcgcctgct cccttggggc ttcatagatt agcttatgga ccaaacccat 2206 

agggacccct aatgacttct gtggagattc tccacgggcg caagaggtct ctccggataa 2266

ggtgccttct gtaaacgagt gcggatttgt aaccaggcta ttttgttctt gcccagagcc 2326 

tttaatataa tatttaaagt tgtgtccact ggataaggtt tcgtcttgcc caactgttac 2386 

tgccaaattg aattcaagaa acgtgtgtgg gtcttttctc cccacgtcac catgataaaa 2446 

taggtccctc cccaaactgt aggtctttta caaaacaaga aaataattta tttttttgtt 2506 

gttgttggat aacgaaatta agtatcggat acttttaatt taggaagtgc atggctttgt 2566






acagtagatg ccatctgggg tattccaaaa acacaccaaa agactttaaa atttcaatct 2626 
cacctgtgtt tgtcttatgt gatctcagtg ttgtatttac cttaaaataa acccgtgttg 2686 
tttttctgcc caaaaaaaaa aaaaaa 2712 
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8 



gaattcggag gattaagttg tcagtcagca cgttgctacc ttcccctcta tgcactccgc 60
tgcctggctc ctcggcgggg agcgagggaa actcagtttg tagggtttac ctctaaaacc 120
tcgataggtt atccttgacg accccgagcc tggaaactcc ctgttgatga ttaattattt 180
gattaaataa gtataacatc caggagaggc cctgccattc caatccagcg cgtttgcttt 240
tgaatccatt acacctgggc ccccataatt aggaaatcta attattcgct tcatcactca 300
ttaataagaa aaatgtccca ggatcattgc tacttacaag gtctttggga gagatatttt 360
actctattaa tccattctat tttatatttc aaattgattt tttttaacag aggaaagtgg 420
ctatcttttt gttttgggca tgtgggccca ttcaccaaaa tgtgatcata aaataaattt 480
taataagata taacttttta aaaagttttc aagtgaagac ggagtcgccg cggaggccgg 540
ggcggcgggg tcttagagcc gacggattcc tgcgctcctc gccccgattg gcgccggact 600
cctctcagct gccgggtgat tggctcaaag ttccgggagg gggcgtggcc cgaggaaagt 660
aaaaactcgc tttcagcaag aagacttttg aaacttttcc caatccctaa aagggacttg 720
gcctcttttt ctgggctcag cggggcagcc gctcggaccc cggcgcgctg accctcgggg 780
ctgccgattc gctgggggct tggagagcct cctgcgcccc tcctcgcgcg ggccgagggt 840
ccaccttggt ccccaggccg cggcgtctcc gctgggtccg cggccgcccg cctgcccgcg 900
ctgccgccgc cgggtcctgg agccagcgag gagcggggcc ggcgctgcgc ttgcccgggg 960
cgcgccctcc aggatgccga tccgcccggt ccgctgaaag cgcgcgcccc tgctcggccc 1020
gagcgacgac gaccgcgcac cctcgccccg gaggctgcca ggagaccggg gccgcccctc 1080
ccgctcccct cctctccccc tctggctctc tcgcgctctc tcgctctcag ggcccccctc 1140
gctcccccgg ccgcagtccg tgcgcgaggg cgccggcgag ccgtctcgga agcagcatgc 1200
aggcgcgcta ctccgtgtcc gaccccaacg ccctgggagt ggtgccctac ctgagcgagc 1260
agaattacta ccgggctgcg ggcagctacg gcggcatggc cagccccatg ggcgtctatt 1320
ccggccaccc ggagcagtac agcgcgggga tgggccgctc ctacgcgccc taccaccacc 1380
accagcccgc ggcgcctaag gacctggtga agccgcccta cagctacatc gcgctcatca 1440
ccatggccat ccagaacgcg cccgagaaga agatcacctt gaacggcatc taccagttca 1500
tcatggaccg cttccccttc taccgggaga acaagcaggg ctggcagaac agcatccgcc 1560
acaacctctc gctcaacgag tgcttcgtca aggtgccccg cgacgacaag aagcccggca 1620
agggcagtta ctggaccctg gacccggact cctacaacat gttcgagaac ggcagcttcc 1680
tgcggcgccg gcggcgcttc aaaaagaagg acgtgtccaa ggagaaggag gagcgggccc 1740
acctcaagga gccgcccccg gcggcgtcca agggcgcccc ggccaccccc cacctagcgg 1800
acgcccccaa ggaggccgag aagaaggtgg tgatcaagag cgaggcggcg tccccggcgc 1860
tgccggtcat caccaaggtg gagacgctga gccccgagag cgcgctgcag ggcagcccgc 1920
gcagcgcggc ctccacgccc gccggctccc ccgacggttc gctgccggag caccacgccg 1980
cggcgcccaa cgggctgcct ggcttcagcg tggagaacat catgaccctg cgaacgtcgc 2040
cgccgggcgg agagctgagc ccgggggccg gacgcgcggg cctggtggtg ccgccgctgg 2100
cgctgccata cgccgccgcg ccgcccgccg cctacggcca gccgtgcgct cagggcctgg 2160
aggccggggc cgccgggggc taccagtgca gcatgcgagc gatgagcctg tacaccgggg 2220
ccgagcggcc ggcgcacatg tgcgtcccgc ccgccctgga cgaggccctc tcggaccacc 2280
cgagcggccc cacgtcgccc ctgagcgctc tcaacctcgc cgccggccag gagggcgcgc 2340
tcgccgccac gggccaccac caccagcacc acggccacca ccacccgcag gcgccgccgc 2400
ccccgccggc tccccagccc cagccgacgc cgcagcccgg ggccgccgcg gcgcaggcgg 2460
cctcctggta tctcaaccac agcggggacc tgaaccacct ccccggccac acgttcgcgg 2520
cccagcagca aactttcccc aacgtgcggg agatgttcaa ctcccaccgg ctggggattg 2580
agaactcgac cctcggggag tcccaggtga gtggcaatgc cagctgccag ctgccctaca 2640
gatccacgcc gcctctctat cgccacgcag ccccctactc ctacgactgc acgaaatact 2700
gacgtgtccc gggacctccc ctccccggcc cgctccggct tcgcttccca gccccgaccc 2760
aaccagacaa ttaaggggct gcagagacgc aaaaaagaaa caaaacatgt ccaccaacct 2820
tttctcagac ccgggagcag agagcgggca cgctagcccc cagccgtctg tgaagagcgc 2880
aggtaacttt aattcgccgc cccgtttctg ggatcccagg aaacccctcc aaagggacgc 2940
agcccaacaa aatgagtatt ggtcttaaaa tccccctccc ctaccaggac ggctgtgctg 3000
tgctcgacct gagctttcaa aagttaagtt atggacccaa atcccatagc gagcccctag 3060
tgactttctg taggggtccc cataggtgta tgggggtctc tatagataat atatgtgctg 3120
tgtgtaattt taaatttctc caaccgtgct gtacaaatgt gtggatttgt aatcaggcta 3180
ttttgttgtt gttgttgttg ttcagagcca ttaatataat atttaaagtt gagttcactg 3240
gataagtttt tcatcttgcc caaccatttc taactgccaa attgaattc 3289
atg cag gcg cgc tac tcc gtg tcc gac ccc aac gcc ctg gga gtg gtg 48 
Met Gln Ala Arg Tyr Ser Val Ser Asp Pro Asn Ala Leu Gly Val Val 
1 5 10 15 

ccc tac ctg agc gag cag aat tac tac cgg gct gcg ggc agc tac ggc 96 
Pro Tyr Leu Ser Glu Gln Asn Tyr Tyr Arg Ala Ala Gly Ser Tyr Gly 
20 25 30 

ggc atg gcc agc ccc atg ggc gtc tat tcc ggc cac ccg gag cag tac 144 
Gly Met Ala Ser Pro Met Gly Val Tyr Ser Gly His Pro Glu Gln Tyr 
35 40 45 
agc gcg ggg atg ggc cgc tcc tac gcg ccc tac cac cac cac cag ccc 192 
Ser Ala Gly Met Gly Arg Ser Tyr Ala Pro Tyr His His His Gln Pro 
50 55 60 

gcg gcg cct aag gac ctg gtg aag ccg ccc tac agc tac atc gcg ctc 240 
Ala Ala Pro Lys Asp Leu Val Lys Pro Pro Tyr Ser Tyr Ile Ala Leu 
65 70 75 80 

atc acc atg gcc atc cag aac gcg ccc gag aag aag atc acc ttg aac 288 
Ile Thr Met Ala Ile Gln Asn Ala Pro Glu Lys Lys Ile Thr Leu Asn 
85 90 95 

ggc atc tac cag ttc atc atg gac cgc ttc ccc ttc tac cgg gag aac 336 
Gly Ile Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn 
100 105 110 

aag cag ggc tgg cag aac agc atc cgc cac aac ctc tcg ctc aac gag 384 
Lys Gln Gly Trp Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu 
115 120 125 

tgc ttc gtc aag gtg ccc cgc gac gac aag aag ccc ggc aag ggc agt 432 
Cys Phe Val Lys Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser 
130 135 140 

tac tgg acc ctg gac ccg gac tcc tac aac atg ttc gag aac ggc agc 480 
Tyr Trp Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser 
145 150 155 160 

ttc ctg cgg cgc cgg cgg cgc ttc aaa aag aag gac gtg tcc aag gag 528 
Phe Leu Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp Val Ser Lys Glu 
165 170 175 

aag gag gag cgg gcc cac ctc aag gag ccg ccc ccg gcg gcg tcc aag 576 
Lys Glu Glu Arg Ala His Leu Lys Glu Pro Pro Pro Ala Ala Ser Lys 
180 185 190 

ggc gcc ccg gcc acc ccc cac cta gcg gac gcc ccc aag gag gcc gag 624 
Gly Ala Pro Ala Thr Pro His Leu Ala Asp Ala Pro Lys Glu Ala Glu 
195 200 205 

aag aag gtg gtg atc aag agc gag gcg gcg tcc ccg gcg ctg ccg gtc 672 
Lys Lys Val Val Ile Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val 
210 215 220 

atc acc aag gtg gag acg ctg agc ccc gag agc gcg ctg cag ggc agc 720 
Ile Thr Lys Val Glu Thr Leu Ser Pro Glu Ser Ala Leu Gln Gly Ser 
225 230 235 240 

ccg cgc agc gcg gcc tcc acg ccc gcc ggc tcc ccc gac ggt tcg ctg 768 
Pro Arg Ser Ala Ala Ser Thr Pro Ala Gly Ser Pro Asp Gly Ser Leu 
245 250 255 

ccg gag cac cac gcc gcg gcg ccc aac ggg ctg cct ggc ttc agc gtg 816 
Pro Glu His His Ala Ala Ala Pro Asn Gly Leu Pro Gly Phe Ser Val 
260 265 270 

gag aac atc atg acc ctg cga acg tcg ccg ccg ggc gga gag ctg agc 864 
Glu Asn Ile Met Thr Leu Arg Thr Ser Pro Pro Gly Gly Glu Leu Ser 
275 280 285 

ccg ggg gcc gga cgc gcg ggc ctg gtg gtg ccg ccg ctg gcg ctg cca 912 
Pro Gly Ala Gly Arg Ala Gly Leu Val Val Pro Pro Leu Ala Leu Pro 
290 295 300 

tac gcc gcc gcg ccg ccc gcc gcc tac ggc cag ccg tgc gct cag ggc 960 
Tyr Ala Ala Ala Pro Pro Ala Ala Tyr Gly Gln Pro Cys Ala Gln Gly 
305 310 315 320 

ctg gag gcc ggg gcc gcc ggg ggc tac cag tgc agc atg cga gcg atg 1008 
Leu Glu Ala Gly Ala Ala Gly Gly Tyr Gln Cys Ser Met Arg Ala Met 
325 330 335 

agc ctg tac acc ggg gcc gag cgg ccg gcg cac atg tgc gtc ccg ccc 1056 
Ser Leu Tyr Thr Gly Ala Glu Arg Pro Ala His Met Cys Val Pro Pro 
340 345 350 

gcc ctg gac gag gcc ctc tcg gac cac ccg agc ggc ccc acg tcg ccc 1104 
Ala Leu Asp Glu Ala Leu Ser Asp His Pro Ser Gly Pro Thr Ser Pro 
355 360 365 

ctg agc gct ctc aac ctc gcc gcc ggc cag gag ggc gcg ctc gcc gcc 1152 
Leu Ser Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Ala Ala 
370 375 380 

acg ggc cac cac cac cag cac cac ggc cac cac cac ccg cag gcg ccg 1200 
Thr Gly His His His Gln His His Gly His His His Pro Gln Ala Pro 
385 390 395 400 

ccg ccc ccg ccg gct ccc cag ccc cag ccg acg ccg cag ccc ggg gcc 1248 
Pro Pro Pro Pro Ala Pro Gln Pro Gln Pro Thr Pro Gln Pro Gly Ala 
405 410 415 

gcc gcg gcg cag gcg gcc tcc tgg tat ctc aac cac agc ggg gac ctg 1296 
Ala Ala Ala Gln Ala Ala Ser Trp Tyr Leu Asn His Ser Gly Asp Leu 
420 425 430 

aac cac ctc ccc ggc cac acg ttc gcg gcc cag cag caa act ttc ccc 1344 
Asn His Leu Pro Gly His Thr Phe Ala Ala Gln Gln Gln Thr Phe Pro 
435 440 445 

aac gtg cgg gag atg ttc aac tcc cac cgg ctg ggg att gag aac tcg 1392 
Asn Val Arg Glu Met Phe Asn Ser His Arg Leu Gly Ile Glu Asn Ser 
450 455 460 

acc ctc ggg gag tcc cag gtg agt ggc aat gcc agc tgc cag ctg ccc 1440 
Thr Leu Gly Glu Ser Gln Val Ser Gly Asn Ala Ser Cys Gln Leu Pro 
465 470 475 480 

tac aga tcc acg ccg cct ctc tat cgc cac gca gcc ccc tac tcc tac 1488 
Tyr Arg Ser Thr Pro Pro Leu Tyr Arg His Ala Ala Pro Tyr Ser Tyr 
485 490 495 

gac tgc acg aaa tac tga 1506 
Asp Cys Thr Lys Tyr 
500 